SUMMARY

Quantum information theory studies the fundamental limits that the laws of
physics impose on data processing tasks such as compression and the
transmission of data over a noisy channel. This thesis presents general
techniques for solving several fundamental problems of quantum information
theory within a single framework. The central theorem of this thesis states the
existence of a protocol for transmitting quantum data that the receiver already
partially knows, using a single use of a noisy quantum channel. This theorem
also has several central theorems of quantum information theory as immediate
corollaries.

The following chapters use this theorem to prove the existence of new protocols
for two other types of quantum channels, namely quantum broadcast channels and
quantum channels with side information provided to the transmitter. These
protocols also deal with the transmission of quantum data partially known to
the receiver using a single use of the channel, and have as corollaries
asymptotic versions with and without auxiliary entanglement. The asymptotic
versions with auxiliary entanglement can, in both cases, be regarded as quantum
versions of the best known coding theorems for the classical versions of these
problems.

The last chapter deals with a purely quantum phenomenon called locking: it is
possible to encode a classical message into a quantum state so that, by
removing from it a subsystem of logarithmic size relative to its total size,
one can ensure that no measurement can have significant correlation with the
message. The message is therefore "locked" by a key of logarithmic size. This
thesis presents the first locking protocol whose success criterion is that the
trace distance between the joint distribution of the message and the
measurement result and the product of their marginals be sufficiently small.

Keywords: Information theory, quantum information



ABSTRACT 

Quantum information theory studies the fundamental limits that physical 
laws impose on information processing tasks such as data compression and data 
transmission on noisy channels. This thesis presents general techniques that al- 
low one to solve many fundamental problems of quantum information theory 
in a unified framework. The central theorem of this thesis proves the existence 
of a protocol that transmits quantum data that is partially known to the receiver 
through a single use of an arbitrary noisy quantum channel. In addition to the 
intrinsic interest of this problem, this theorem has as immediate corollaries sev- 
eral central theorems of quantum information theory. 

The following chapters use this theorem to prove the existence of new pro- 
tocols for two other types of quantum channels, namely quantum broadcast 
channels and quantum channels with side information at the transmitter. These 
protocols also involve sending quantum information partially known by the re- 
ceiver with a single use of the channel, and have as corollaries entanglement- 
assisted and unassisted asymptotic coding theorems. The entanglement-assisted 
asymptotic versions can, in both cases, be considered as quantum versions of the 
best coding theorems known for the classical versions of these problems. 

The last chapter deals with a purely quantum phenomenon called locking. 
We demonstrate that it is possible to encode a classical message into a quantum 
state such that, by removing a subsystem of logarithmic size with respect to its 
total size, no measurement can have significant correlations with the message. 
The message is therefore "locked" by a logarithmic-size key. This thesis presents 
the first locking protocol for which the success criterion is that the trace distance 
between the joint distribution of the message and the measurement result and 
the product of their marginals be sufficiently small. 

Keywords: Information theory, quantum information 
NOTATION

General

$\log$                  Logarithm base 2.
$\ln$                   Natural logarithm.
$\mathbb{R}$            Real numbers.
$\mathbb{C}$            Complex numbers.
$c^*$                   Complex conjugate of $c$.
$\mathbb{E}_U f(U)$     Expectation value of $f(U)$ over the random variable $U$.


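Linear Algebra and Quantum Systems

$A, B, C, \ldots$                                     Labels for quantum systems, or linear operators between Hilbert spaces. (Should be clear from context.)
$\mathsf{A}, \mathsf{B}, \mathsf{C}, \ldots$          Hilbert spaces associated with the systems $A, B, C, \ldots$
$|A|$                                                 Dimension of $\mathsf{A}$.
$AB$                                                  Composite quantum system whose associated Hilbert space is $\mathsf{A} \otimes \mathsf{B}$.
$A^n$                                                 Quantum system composed of $n$ copies of $A$.
$L(\mathsf{A}, \mathsf{B})$                           The space of linear operators from $\mathsf{A}$ to $\mathsf{B}$.
$L(\mathsf{A})$                                       $L(\mathsf{A}, \mathsf{A})$.
$M^{A \to B}$                                         Indicates that the operator $M$ is in $L(\mathsf{A}, \mathsf{B})$.
$M^\dagger$                                           Adjoint of $M$.
$M^\top$                                              Transpose of $M$ with respect to the canonical bases of $\mathsf{A}$ and $\mathsf{B}$.
$M \cdot N$                                           $M N M^\dagger$. This has lower priority than matrix multiplication: $AB \cdot C = (AB) C (AB)^\dagger$.
$\mathrm{Herm}(\mathsf{A})$                           The set of Hermitian operators from $\mathsf{A}$ to $\mathsf{A}$.
$\mathrm{Pos}(\mathsf{A})$                            The subset of $\mathrm{Herm}(\mathsf{A})$ consisting of positive semidefinite matrices.
$M \leq N$                                            If $M, N \in \mathrm{Herm}(\mathsf{A})$, this means that $N - M \in \mathrm{Pos}(\mathsf{A})$.
$D(\mathsf{A})$                                       The set of all density operators on $\mathsf{A}$; i.e. $D(\mathsf{A}) = \{\rho : \rho \in \mathrm{Pos}(\mathsf{A}), \operatorname{Tr}[\rho] = 1\}$.
$\mathcal{N}^{A \to B}, \mathcal{T}^{A \to B}, \ldots$   Superoperators (completely positive linear maps from $L(\mathsf{A})$ to $L(\mathsf{B})$).
$\mathbb{I}^A$                                        Identity operator on $\mathsf{A}$ or identity superoperator on $L(\mathsf{A})$. (Should be clear from context.)
$|\psi\rangle^A, |\varphi\rangle^A, \ldots$           Vectors in $\mathsf{A}$.
$\psi^A, \varphi^A, \ldots$                           The "unketted" versions denote their associated density matrices: $\psi^A = |\psi\rangle\langle\psi|^A$. Furthermore, if we have defined a state $\psi^{AB}$, then $\psi^A = \operatorname{Tr}_B[\psi^{AB}]$.
$\operatorname{op}_{A \to B}(|\psi\rangle^{AB})$      Turns a vector into an operator. See Section 2.6.
$\operatorname{vec}(M^{A \to B})$                     Turns an operator into a vector. See Section 2.6.
$\sqrt{M}$                                            If $M \in \mathrm{Pos}(\mathsf{A})$ has spectral decomposition $M = \sum_i \lambda_i |\psi_i\rangle\langle\psi_i|$, then $\sqrt{M} = \sum_i \sqrt{\lambda_i}\, |\psi_i\rangle\langle\psi_i|$.
$|\Phi\rangle^{AA'}$                                  $\frac{1}{\sqrt{|A|}} \sum_{i=1}^{|A|} |i\rangle^A |i\rangle^{A'}$, where $|i\rangle^A$ and $|i\rangle^{A'}$ are fixed canonical bases for $\mathsf{A}$ and $\mathsf{A}'$, and $\mathsf{A} \cong \mathsf{A}'$.
$\pi^A$                                               The maximally mixed state $\mathbb{I}^A / |A|$.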


Norms and Distance measures

$\|M^{A \to B}\|_1$             $\operatorname{Tr}\left[\sqrt{M^\dagger M}\right]$
$\|M^{A \to B}\|_2$             $\sqrt{\operatorname{Tr}[M^\dagger M]}$
$\|M^{A \to B}\|_\infty$        Largest singular value of $M$.
$\|\mathcal{N}^{A \to B}\|_\diamond$   Diamond norm; see Section 2.3.1.
$F(\rho^A, \sigma^A)$           $\|\sqrt{\rho}\sqrt{\sigma}\|_1$. This is called the fidelity.
$d_F(\rho^A, \sigma^A)$         $\sqrt{1 - F(\rho, \sigma)^2}$. This is called the fidelity distance.



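Entropies

$H(A|B)_\rho$                      Conditional von Neumann entropy of $A$ given $B$ on $\rho^{AB}$, see Definition 2.5.
$H_2(A|B)_\rho$                    Conditional 2-entropy of $A$ given $B$, defined as $-\log \min_{\sigma^B \in D(\mathsf{B})} \operatorname{Tr}\left[\left((\mathbb{I}^A \otimes \sigma^B)^{-1/4} \rho^{AB} (\mathbb{I}^A \otimes \sigma^B)^{-1/4}\right)^2\right]$.
$H_2^\varepsilon(A|B)_\rho$        Smooth 2-entropy of $A$ given $B$, defined as $\max_{\sigma^{AB} : d_F(\rho, \sigma) \leq \varepsilon} H_2(A|B)_\sigma$.
$H_{\min}(A|B)_\rho$               Conditional min-entropy, see Definition 2.10.
$H_{\max}(A|B)_\rho$               Conditional max-entropy, see Definition 2.12.
$H_{\min}^\varepsilon(A|B)_\rho$   $\varepsilon$-smooth conditional min-entropy, see Definition 2.13.
$H_{\max}^\varepsilon(A|B)_\rho$   $\varepsilon$-smooth conditional max-entropy, see Definition 2.15.
$I(A\rangle B)_\rho$               Coherent information, see Definition 2.8.
$I(A;B)_\rho$                      Mutual information, see Definition 2.6.
$I(A;B|C)_\rho$                    Conditional mutual information, see Definition 2.7.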



First names

Alice     The sender in all the protocols.
Bob       The receiver in all the protocols.
CHAPTER 1



INTRODUCTION 



The origins of information theory go back to 1948, when Claude Shannon 
published "A mathematical theory of communication" [Sha48], in which he pro- 
posed a mathematical framework to study information processing tasks such as 
data compression and data transmission over noisy channels. Data compression 
is the following task: we have a large amount of digital data, and we would like 
to shrink it down to a smaller size for efficient storage or transmission. If the 
data is sufficiently redundant, then it is possible to do this with a very small 
probability of decompressing it incorrectly. Data transmission over noisy chan- 
nels involves the following problem: one has a communication channel in which 
the transmitter can select an input and the receiver receives an output that has 
been corrupted by noise in the channel. A concrete example of this would be the 
phone line between a house and the telephone central, or the radio link between 
a cellphone tower and the handsets. One would then like to use this channel to 
send a message and make sure that, with high probability, the receiver will be 
able to reconstruct it exactly. 

Since our universe is governed by the laws of quantum mechanics, the phys- 
ical limits imposed on these problems are themselves quantum mechanical. It 
also turns out that information can behave in counterintuitive ways under the 
laws of quantum mechanics: for example, one can know precisely the state of a 
two-particle quantum system while remaining ignorant of the state of either of 
the two particles separately. Furthermore, measurements made on two particles 
that are kept very far apart can exhibit correlations that could not be explained 
classically without assuming that information was transmitted faster than the 
speed of light. This is why a quantum version of information theory is so inter- 
esting: it is our attempt at taming these apparent paradoxes and counterintuitive 
facts. In this thesis, we will be concerned specifically with coding for various 
different types of quantum channels. In the last chapter, we will also look at
the phenomenon of information locking, in which a small key can "unlock" an
amount of information far beyond what would be possible classically.

1.1 Decoupling 

One of the most bizarre features of quantum information theory turns out 
to be extremely useful for solving channel coding problems. It is the notion of 
purification: given any quantum system A whose state is random, one can find a 
bigger system AB such that the state on A is the same as before, but where the 
global state on AB is completely deterministic. This is impossible classically: 
if the state of a system is random, considering it together with another system 
only adds the potential of having more randomness globally. This, however, 
will help us tremendously. In a channel coding problem, we want to ensure that 
the output of the channel is strongly correlated (or "coupled") with the input. 
When we look at the purification of the final state that we want between the 
input and output, however, it turns out that this is equivalent to requiring that 
the input to the channel be completely decorrelated with the entire universe 
minus the channel output. This helps us because we can achieve it by destroying 
correlations — and, as in other areas of life, destruction is easier to achieve than 
construction. 

This "decoupling" approach — we use the term "decouple" to mean 
"decorrelate" — has therefore become a staple of quantum information theory. 
It was already used to some extent in [Dev05], the first general coding theorem 
for the quantum capacity of quantum channels, and was used more systemati- 
cally in [HOW07] and [ADHW06], which derived basic quantum protocols from 
which a large number of other, previously known protocols could be derived. 
In [HHYW08], the results of [Dev05] were revisited using a "purer" decoupling 
approach. 

While we have some sense that these last three papers use the same "trick", 
they are nonetheless proven separately, and while they can be used to derive
other protocols, one sometimes needs to work quite a bit to accommodate the
particular forms of the theorems (using, for instance, typical projectors to limit the
dimensions of various quantum systems). One of the main contributions of this 
thesis is to give a general decoupling theorem, from which all of the known 
ones can be derived very easily, and which is much more flexible. We then go 
on to give quantum coding theorems for different varieties of quantum chan- 
nels, including quantum broadcast channels and quantum channels with side 
information at the transmitter. In both cases, no prior results exist regarding the 
particular tasks considered. Finally, we also use the main decoupling theorem 
to prove a result on information locking. 

1.2 Contributions 

This thesis is broken down into the following chapters: 

Chapter 2 (Preliminaries): This chapter contains the concepts and defini- 
tions necessary to understand the rest of the thesis. It does not contain original 
material. 

Chapter 3 (The decoupling theorem): This chapter is devoted to the main 
decoupling theorem. We state it and prove it along with several variants, in- 
cluding a new one-shot coding theorem for quantum channels, in which Bob 
potentially knows part of the state before the start of the protocol. We then use it 
to rederive the main results of [HOW07], [ADHW06], [GPW05] and [HHYW08] 
in a more straightforward manner. The contents of this chapter will be published 
as a paper at a later date. 

Chapter 4 (Quantum channels with side information at the transmitter): 
This chapter derives new results on quantum channels with side information at 
the transmitter. A channel with side information at the transmitter is a channel 
in which the transmitter has access ahead of time to information about the noise 
in the channel, but where the receiver does not have access to this information. 
We give a one-shot coding theorem for them similar to the one for regular
channels in Chapter 3, and show that applying it to entanglement-assisted coding
for memoryless channels yields an optimal protocol. In particular, we show that 
the entanglement-assisted capacity of these channels admits a single-letter for- 
mula that parallels the solution to the classical version of this problem given in 
[GP80]. Part of the work in this section was presented in a different form at the 
2009 International Symposium on Information Theory [Dup09]. 

Chapter 5 (Quantum broadcast channels): This chapter contains a coding 
theorem for quantum broadcast channels, namely channels with one input but 
two outputs going to two physically separated receivers. Again, we give a 
general one-shot coding theorem, and we then derive from it an entanglement- 
assisted coding scheme for memoryless channels that parallels the best known 
classical coding theorem for broadcast channels given in [Mar79]. These are the 
first coding theorems given for these tasks. A different version of this work was 
accepted for publication in IEEE Transactions on Information Theory and is joint 
work with Patrick Hayden and Ke Li [DHL09]. 

Chapter 6 (Locking classical information in quantum states): This chapter 
deals with the purely quantum phenomenon of information locking. We show 
that there exists a unitary such that if we encode a classical message into a quan- 
tum state, apply this unitary to it, and remove a very small part (logarithmic 
in the total size), then one can get almost no information about the message by 
measuring the remaining part. This is done by showing that the statistical dis- 
tance between the joint distribution of the message and the measurement result 
and a product distribution can be made very small. This is slightly stronger than 
what was done in prior information locking results, in which upper bounds on 
the mutual information between the measurement result and the message were 
derived. Furthermore, this is the first locking protocol in which one uses a sin- 
gle unitary and a quantum key instead of applying one of several unitaries and 
using the choice of unitary as the key. We also show that this scheme can be 
used to construct a quantum key distribution protocol that guarantees that the 
eavesdropper can gain almost no information about the key by making a
measurement immediately after the execution of the protocol, but where the
eavesdropper only needs to learn a very small portion of the key to be able to recover
the rest. This underscores much more spectacularly than before [KRBM07] the 
need to take into account the fact that an eavesdropper might keep quantum in- 
formation after the protocol and use it only when making his actual attack. This 
will be published at a later date and is joint work with Patrick Hayden and Deb- 
bie Leung. 

Chapter 7 (Conclusion): This chapter concludes the thesis with a recapitula- 
tion of what was done, and speculates on what the future might hold. 



CHAPTER 2 



PRELIMINARIES 

This chapter explains the notation used throughout the thesis and presents 
some concepts one needs to understand this document. 

2.1 Notation 

Linear algebra is the language of quantum mechanics; we therefore start by 
introducing the notation we will use for linear algebraic concepts. One can find 
explanations of all the concepts below in any linear algebra textbook, or, to be 
introduced to these concepts in the setting of quantum information, in [Wat08]. 
Note that a condensed version of this appears on pages ix-xi so the reader can 
refer back to it more easily. We will denote by sans-serif capital letters (such
as $\mathsf{A}, \mathsf{B}, \ldots$) complex finite-dimensional inner product vector
spaces (which we will usually simply call Hilbert spaces, following the usual
quantum information convention; these are the only Hilbert spaces that we will
ever consider in this thesis), and we will use regular capital letters
$A, B, \ldots$ to label the quantum systems associated with the spaces
$\mathsf{A}, \mathsf{B}, \ldots$ We will denote the dimension of $\mathsf{A}$ by
$|A|$. Vectors in $\mathsf{A}$ are denoted by "kets" $|\psi\rangle^A$, with the
superscript omitted when it causes no confusion. Furthermore, we will denote by
$L(\mathsf{A}, \mathsf{B})$ the space of linear operators from $\mathsf{A}$ to
$\mathsf{B}$, and we will use the shorthand $L(\mathsf{A})$ for
$L(\mathsf{A}, \mathsf{A})$. Elements of the dual space $L(\mathsf{A}, \mathbb{C})$
of $\mathsf{A}$ are written as "bras"; for instance, the dual of $|\psi\rangle$
is written $\langle\psi|$. We use $\dagger$ to designate the adjoint of an
operator, $\mathrm{Herm}(\mathsf{A})$ is the set of all Hermitian (self-adjoint)
operators on $\mathsf{A}$, and $\mathrm{Pos}(\mathsf{A}) \subseteq \mathrm{Herm}(\mathsf{A})$
is the set of all positive semidefinite operators on $\mathsf{A}$. Given two
operators $M, N \in \mathrm{Herm}(\mathsf{A})$, we say that $M \leq N$ if
$N - M \in \mathrm{Pos}(\mathsf{A})$. Given an operator $M \in L(\mathsf{A}, \mathsf{B})$,
we will use the superscript $M^{A \to B}$ to indicate its input and output
spaces. We will use the symbol $\cdot$ to denote conjugation: given two
operators $M^{A \to B}$ and $N^A$, we define $M \cdot N := M N M^\dagger$.

We will denote by the calligraphic letters $\mathcal{N}^{A \to B}, \mathcal{T}^{A \to B}, \mathcal{S}^{A \to B}, \ldots$ completely
positive linear maps from $L(\mathsf{A})$ to $L(\mathsf{B})$; we will call these "superoperators". We
will also write $\mathbb{I}^A$ for either the identity operator on $\mathsf{A}$, or the identity
superoperator on $L(\mathsf{A})$; which one is meant should be clear from the context.

In superscripts, we will simply concatenate letters to indicate the tensor
product: for instance, $M^{AB \to CD} \in L(\mathsf{A} \otimes \mathsf{B}, \mathsf{C} \otimes \mathsf{D})$. When applying an operator
$M^{A \to B}$ to a vector $|\psi\rangle^{AC}$, we will usually omit the implicit identity: for instance,

$$M^{A \to B} |\psi\rangle^{AC} = (M^{A \to B} \otimes \mathbb{I}^C) |\psi\rangle^{AC}.$$

Given an operator $M^{AB} \in L(\mathsf{A} \otimes \mathsf{B})$ on a composite Hilbert space, we can
define its partial trace on $B$, denoted either as $\operatorname{Tr}_B[M^{AB}]$ or simply by $M^A$,
omitting the $B$ in the superscript, as the unique operator $N$ in $L(\mathsf{A})$ such that
$\operatorname{Tr}[ZN] = \operatorname{Tr}[(Z \otimes \mathbb{I}^B) M^{AB}]$ for all $Z \in L(\mathsf{A})$. In other words, the partial trace is
defined as the adjoint of the superoperator $\mathcal{T}^{A \to AB}$, $\mathcal{T}(N^A) = N^A \otimes \mathbb{I}^B$, under the
Hilbert-Schmidt inner product $\langle X, Y \rangle := \operatorname{Tr}[X^\dagger Y]$.
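As a small numerical illustration of this definition, the following sketch computes a partial trace with NumPy and checks the defining identity $\operatorname{Tr}[Z M^A] = \operatorname{Tr}[(Z \otimes \mathbb{I}^B) M^{AB}]$ on random inputs. The function name `partial_trace_B`, the dimensions and the random test operators are illustrative choices, not part of the thesis; the code assumes that $A$ is the first tensor factor in the ordering used by `np.kron`.

```python
import numpy as np

def partial_trace_B(M_AB, dim_A, dim_B):
    """Trace out B from an operator on A (x) B, with A the first tensor factor."""
    # View the matrix as a 4-index tensor (a, b, a', b') and sum over b = b'.
    M = np.asarray(M_AB).reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum("abcb->ac", M)

# Hypothetical test data: a random operator on a 2 x 3 dimensional composite space.
rng = np.random.default_rng(0)
dim_A, dim_B = 2, 3
M_AB = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
Z = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

M_A = partial_trace_B(M_AB, dim_A, dim_B)
lhs = np.trace(Z @ M_A)                               # Tr[Z M^A]
rhs = np.trace(np.kron(Z, np.eye(dim_B)) @ M_AB)      # Tr[(Z (x) I^B) M^AB]
```

The two traces `lhs` and `rhs` should agree up to floating-point error, which is exactly the adjointness property used to define the partial trace.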

We will also need the concept of partial isometries. A partial isometry is an
operator $V^{A \to B}$ whose singular values are all either 1 or 0. Equivalently, a
partial isometry can be defined as any operator $V^{A \to B}$ such that $V^\dagger V$ and $V V^\dagger$ are projectors. A
full-rank partial isometry is a partial isometry $V^{A \to B}$ whose rank is $\min\{|A|, |B|\}$.

2.2 Quantum mechanics: an extremely short introduction 

Since one cannot hope to cover basic quantum mechanics in a few para- 
graphs, the author strongly recommends the interested reader to consult [NC00] 
or [Wat08] for a more complete introduction. Nonetheless, a short introduction 
to the basic concepts using the notation that we will use is given here for the 
sake of completeness. 

A quantum system $A$ is represented by a Hilbert space $\mathsf{A}$; a state of the system
is a positive semidefinite operator $\rho^A \in \mathrm{Pos}(\mathsf{A})$ such that $\operatorname{Tr}[\rho^A] = 1$. We also call
these states density operators or density matrices and denote the set of all density
operators on $\mathsf{A}$ as $D(\mathsf{A})$. A state $\rho$ is considered pure if $\operatorname{rank} \rho = 1$, in which case
there exists a $|\psi\rangle \in \mathsf{A}$ of norm 1 such that $\rho = |\psi\rangle\langle\psi|$. We sometimes also call
general states mixed states when we want to emphasize the fact that the state is
not necessarily pure. Let $\mathsf{A}$ and $\mathsf{B}$ be Hilbert spaces corresponding to quantum
systems $A$ and $B$; we can then consider them as a single composite system $AB$
with $\mathsf{A} \otimes \mathsf{B}$ as its associated Hilbert space. By convention in this document, we
will write systems on which a quantum state is defined as a superscript; for
instance, $\rho^{AB} \in D(\mathsf{A} \otimes \mathsf{B})$. The same convention will apply to all operators. If the
input and output spaces of an operator are different, we will write an arrow in
the superscript to indicate this: for example, $M^{A \to B} \in L(\mathsf{A}, \mathsf{B})$. When we want to
consider only part of a composite system, we take its partial trace on the system
we want to eliminate.

The operations that can be applied to a quantum system without making
it interact with other systems correspond to the unitary operators on the
associated Hilbert space, namely all transformations of the form $\rho \mapsto U \rho U^\dagger$, where $U$ is
unitary. Since conjugation will be used so often in this thesis, we will use the
notation $A \cdot B$ to denote $A B A^\dagger$. Transformations involving interactions with other
systems can be simulated by adding an ancillary system, applying a unitary on
the composite system, and then tracing out part of the remaining system.

Such a transformation can also be represented by a trace-preserving
superoperator (sometimes called CPTP map, which stands for "completely
positive trace-preserving" map). It can be shown that a linear map $\mathcal{N}^{A \to B}$ is
completely positive if and only if it can be written as $\mathcal{N}(\rho) = \sum_i N_i \rho N_i^\dagger$,
where $N_i \in L(\mathsf{A}, \mathsf{B})$; furthermore, any such linear map is trace-preserving (i.e.
$\operatorname{Tr}[\mathcal{N}(M)] = \operatorname{Tr}[M]$ for all $M$) if $\sum_i N_i^\dagger N_i = \mathbb{I}^A$. We will sometimes call
trace-preserving superoperators "quantum channels" when we want to emphasize
that this is a transformation over which we have no control and wish to view as
a noisy channel.

There is also a class of operations that leaves the quantum system intact but
changes its underlying Hilbert space. For instance, suppose we have a state $\rho^A$
and want to embed the information it contains into the system $B$. An operation
that does this is a partial isometry $V^{A \to B}$ such that $\operatorname{Tr}[V \rho V^\dagger] = 1$ (i.e. the image of
$V^\dagger$ must contain the support of $\rho$). We will sometimes call these simply
"isometries" since they act as isometries on the part of $\mathsf{A}$ in which $\rho$ lies. Such an
operation can be implemented by the superoperator $\mathcal{V}^{A \to B}(\rho^A) = V \rho V^\dagger + \sum_i N_i \rho N_i^\dagger$,
where the $N_i$ are such that $V^\dagger V + \sum_i N_i^\dagger N_i = \mathbb{I}^A$ and $\operatorname{Tr}[N_i \rho N_i^\dagger] = 0$ for all $i$.

Quantum systems can also be measured, yielding a classical output. In
addition to the measurement result, a measurement can also have a quantum residue,
in case the measurement does not completely measure the state. To represent
this, we will use a special type of trace-preserving superoperator that we will
call a measurement superoperator. A measurement superoperator is a
superoperator of the form $\mathcal{M}^{A \to BX}(\sigma^A) = \sum_x |x\rangle\langle x|^X \otimes N_x \sigma^A N_x^\dagger$, where the $|x\rangle$ are all
part of the same orthonormal basis for $\mathsf{X}$, and the $N_x^{A \to B}$ are arbitrary operators
such that $\mathcal{M}$ is CPTP. The interpretation for this is that we get the measurement
result $x$ with probability $\operatorname{Tr}[N_x \sigma N_x^\dagger]$, $X$ is a classical register that holds the
measurement result, and, if the measurement result was $x$, the $B$ register gets the
state $N_x \sigma N_x^\dagger / \operatorname{Tr}[N_x \sigma N_x^\dagger]$. If we are not interested in the quantum residue and
only care about the classical result, we only need to describe the set of
positive semidefinite operators $\{N_x^\dagger N_x\}$ to describe the measurement, which is then
called a positive operator valued measure, or POVM. We call a measurement
superoperator complete if all of the $N_x$ are of rank 1, in which case the quantum residue
is superfluous since it can be reconstructed from the classical result only.

One particularly strange and interesting feature of quantum mechanics is the
concept of entanglement. We say that a bipartite state $\rho^{AB}$ is entangled if it cannot
be written in the form $\rho^{AB} = \sum_i \alpha_i\, \sigma_i^A \otimes \omega_i^B$. In other words, a state on $AB$ is
entangled if it cannot be expressed as a probabilistic mixture of separate states on
$A$ and $B$. An example of such a state is the following pure maximally entangled
state that will be of great importance throughout the thesis: $\Phi^{AB} = |\Phi\rangle\langle\Phi|^{AB}$
with $|\Phi\rangle^{AB} = \frac{1}{\sqrt{|A|}} \sum_i |i\rangle^A \otimes |i\rangle^B$, where $|A| = |B|$ and the $|i\rangle^A$ and $|i\rangle^B$ are
standard orthonormal bases for $\mathsf{A}$ and $\mathsf{B}$. When $|A| = |B| = 2$, we will call this
state an EPR pair [EPR35], after Einstein, Podolsky and Rosen, who first noticed
the phenomenon of entanglement and defined this state. With some abuse of
terminology, we will call higher-dimensional instances of this state "EPR pairs"
even when the dimension is not a power of two.

Of central importance to this thesis is the concept of purification. Given a
mixed state $\rho^A$, it is always possible to find a pure state $\omega^{AB}$ on a larger system
such that $\rho^A = \operatorname{Tr}_B[\omega^{AB}]$. Note that this is also a purely quantum phenomenon:
if one has a probability distribution $p$ over a set $\mathcal{X}$, it is impossible to find a
single element $(x, y)$ of $\mathcal{X} \times \mathcal{Y}$ which then somehow ends up being distributed
as $p$ when we stop looking at the $y$ part of it!
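One standard way to build such a purification is from the spectral decomposition $\rho^A = \sum_i \lambda_i |v_i\rangle\langle v_i|$, taking $|\omega\rangle^{AB} = \sum_i \sqrt{\lambda_i}\, |v_i\rangle^A |i\rangle^B$ with $|B| = |A|$. The sketch below checks this numerically; the function names and the example state are illustrative assumptions, and this is only one of many valid purifications.

```python
import numpy as np

def purify(rho):
    """Return a unit vector |omega> on A (x) B with |B| = |A| and Tr_B |omega><omega| = rho."""
    evals, evecs = np.linalg.eigh(rho)
    evals = np.clip(evals, 0.0, None)      # guard against tiny negative rounding errors
    dim = rho.shape[0]
    omega = np.zeros(dim * dim, dtype=complex)
    for i in range(dim):
        # sqrt(lambda_i) |v_i>^A (x) |i>^B
        omega += np.sqrt(evals[i]) * np.kron(evecs[:, i], np.eye(dim)[i])
    return omega

def partial_trace_B(M_AB, dim_A, dim_B):
    """Trace out the second tensor factor."""
    M = M_AB.reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum("abcb->ac", M)

# Hypothetical example: a mixed qubit state (positive semidefinite, trace 1).
rho = np.array([[0.75, 0.25], [0.25, 0.25]])
omega = purify(rho)
omega_AB = np.outer(omega, omega.conj())   # the pure state |omega><omega| on AB
reduced = partial_trace_B(omega_AB, 2, 2)  # should reproduce rho
```

Here `omega_AB` has rank one, so the global state is pure, while `reduced` recovers the original mixed state on $A$.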

An analogous fact holds for quantum channels: given a completely positive
superoperator $\mathcal{N}^{A \to B}$, it is possible to find a partial isometry $U_{\mathcal{N}}^{A \to BE}$ such that
$\mathcal{N}(X) = \operatorname{Tr}_E[U_{\mathcal{N}} \cdot X]$ for every $X \in L(\mathsf{A})$. In other words, one can find a
deterministic operation that takes the input $A$ to two output systems: the actual
output of the channel $B$, and an environment system $E$. When we ignore the
environment system, we get exactly the same channel. We call such a partial
isometry a Stinespring dilation of $\mathcal{N}$.
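To make the dilation concrete, the sketch below builds one from a set of Kraus operators $N_i$ by setting $U = \sum_i N_i \otimes |i\rangle^E$, and checks that tracing out $E$ reproduces $\sum_i N_i X N_i^\dagger$. The amplitude-damping channel with parameter `gamma = 0.3` is only an illustrative choice that does not appear in the thesis, and the function names are hypothetical.

```python
import numpy as np

def stinespring_from_kraus(kraus_ops):
    """Stack Kraus operators N_i (each |B| x |A|) into U = sum_i N_i (x) |i>^E."""
    n_env = len(kraus_ops)
    dim_B, dim_A = kraus_ops[0].shape
    U = np.zeros((dim_B * n_env, dim_A), dtype=complex)
    for i, N in enumerate(kraus_ops):
        e_i = np.zeros((n_env, 1))
        e_i[i, 0] = 1.0
        U += np.kron(N, e_i)               # output ordering is B first, then E
    return U

def apply_via_dilation(U, X, dim_B, n_env):
    """Compute Tr_E[U X U^dagger]."""
    out = (U @ X @ U.conj().T).reshape(dim_B, n_env, dim_B, n_env)
    return np.einsum("bece->bc", out)

# Illustrative channel (an assumption, not from the thesis): amplitude damping.
gamma = 0.3
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]]),
         np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]

U = stinespring_from_kraus(kraus)
X = np.array([[0.5, 0.5], [0.5, 0.5]])
via_dilation = apply_via_dilation(U, X, dim_B=2, n_env=2)
direct = sum(N @ X @ N.conj().T for N in kraus)
```

Because these Kraus operators satisfy $\sum_i N_i^\dagger N_i = \mathbb{I}^A$, the resulting `U` is a genuine isometry ($U^\dagger U = \mathbb{I}^A$), and `via_dilation` agrees with `direct`.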

2.3 Distance measures 

We will often need a notion of distance between quantum states, usually to
state that the result of a particular protocol that we developed is "close" to some
ideal output state that we would like to get. The distance we will use most of
the time is called the trace distance; the trace distance between two states $\rho$ and $\sigma$
is $\|\rho - \sigma\|_1$, where $\|M\|_1 := \operatorname{Tr}\sqrt{M^\dagger M}$ for any $M^{A \to B}$. In other words, it is equal
to the sum of the absolute values of the eigenvalues of the matrix $\rho - \sigma$. The
reason for which this is a meaningful measure of distance is that it characterizes
how easy it is for someone to determine through a measurement whether an
unknown state is $\rho$ or $\sigma$, as was discovered by Helstrom:

Theorem 2.1 (Helstrom's theorem [Hel69]). Let $\rho^A$ and $\sigma^A$ be two density operators
on $\mathsf{A}$, and suppose one holds $\rho^A$ with probability $\frac{1}{2}$ and $\sigma^A$ with probability $\frac{1}{2}$, and one
tries to determine which one it is by performing a measurement on $A$. Then, the best
possible measurement will give the correct answer with probability $\frac{1}{2} + \frac{1}{4}\|\rho - \sigma\|_1$.

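Proof. Let $\{M_\rho, M_\sigma\}$ be a POVM used to guess which state we have (since there
are only two possible answers, one only needs two POVM operators). Then, the
probability of guessing correctly is

$$
\begin{aligned}
\tfrac{1}{2}\operatorname{Tr}[M_\rho \rho] + \tfrac{1}{2}\operatorname{Tr}[M_\sigma \sigma]
&= \tfrac{1}{2}\operatorname{Tr}[M_\rho \rho + (\mathbb{I} - M_\rho)\sigma] \\
&= \tfrac{1}{2}\operatorname{Tr}[M_\rho \rho + \sigma - M_\rho \sigma] \\
&= \tfrac{1}{2} + \tfrac{1}{2}\operatorname{Tr}[M_\rho(\rho - \sigma)] \\
&\leq \tfrac{1}{2} + \tfrac{1}{2}\operatorname{Tr}[M_\rho P_+(\rho - \sigma)P_+] \\
&\leq \tfrac{1}{2} + \tfrac{1}{2}\operatorname{Tr}[P_+(\rho - \sigma)P_+] \\
&= \tfrac{1}{2} + \tfrac{1}{4}\|\rho - \sigma\|_1
\end{aligned}
$$

where $P_+$ is a projector onto the eigenspaces of $\rho - \sigma$ corresponding to positive
eigenvalues. The first inequality is due to the operator inequality $P_+(\rho - \sigma)P_+ \geq
\rho - \sigma$, and the second inequality, to the operator inequality $M_\rho \leq \mathbb{I}$. The last
equality is due to the fact that, since $\rho - \sigma$ has zero trace, the trace of $P_+(\rho - \sigma)P_+$ must
correspond to exactly half of the trace distance. Of course, equality can be attained
if $M_\rho = P_+$. □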

This means that, if the trace distance between two states is very small,
someone trying to determine which of the states an unknown state is in will be
scarcely better off by doing the optimal measurement than by guessing
randomly. In particular, if the output of a quantum protocol is $\varepsilon$-close in trace
distance to the output of an ideal protocol, then, regardless of what we use the
protocol for, we will almost never be able to tell the difference.

A related notion is the fidelity between quantum states: 
Definition 2.1 (Fidelity). Given two states $\rho$ and $\sigma$, their fidelity is defined as

$$F(\rho, \sigma) := \left\|\sqrt{\rho}\sqrt{\sigma}\right\|_1.$$

One can easily see that the fidelity approaches one when two states get closer
together; in fact, $F(\rho, \rho) = \|\rho\|_1 = 1$. An important property of the fidelity is that
it is stable under purifications: given two states $\rho^A$ and $\sigma^A$, and a purification $\rho^{AB}$
of $\rho^A$, then $F(\rho^A, \sigma^A) = \max_{\sigma^{AB}} F(\rho^{AB}, \sigma^{AB})$, where we maximize over all
purifications of $\sigma^A$. This is due to Uhlmann's theorem [Uhl76] and will be proven in
the next chapter as Theorem 3.1. One can also define a distance measure based
on the fidelity:

Definition 2.2 (Fidelity distance). Let $\rho$ and $\sigma$ be two density operators. Then, their
fidelity distance is defined as

$$d_F(\rho, \sigma) := \sqrt{1 - F(\rho, \sigma)^2}.$$

The fidelity distance is essentially equivalent to the trace distance, as shown 
by the Fuchs-van de Graaf inequalities [FvdG99]: 

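Lemma 2.2 (Fuchs-van de Graaf inequalities). Let $\rho \in D(\mathsf{A})$ and $\sigma \in D(\mathsf{A})$ be
density operators on $\mathsf{A}$. Then,

$$1 - \tfrac{1}{2}\|\rho - \sigma\|_1 \leq F(\rho, \sigma) \leq \sqrt{1 - \tfrac{1}{4}\|\rho - \sigma\|_1^2}.$$

This implies that

$$\tfrac{1}{2}\|\rho - \sigma\|_1 \leq d_F(\rho, \sigma) \leq \sqrt{\|\rho - \sigma\|_1}.$$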
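The following sketch evaluates the fidelity, the fidelity distance and the trace distance for one pair of qubit states, so that the inequalities of Lemma 2.2 can be checked numerically. The helper names and the two example states are illustrative assumptions; the matrix square root is computed from an eigendecomposition, which is valid for positive semidefinite inputs.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semidefinite matrix via its spectral decomposition."""
    evals, evecs = np.linalg.eigh(M)
    evals = np.clip(evals, 0.0, None)      # guard against tiny negative rounding errors
    return (evecs * np.sqrt(evals)) @ evecs.conj().T

def trace_norm(M):
    """||M||_1, computed as the sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

def fidelity(rho, sigma):
    """F(rho, sigma) = || sqrt(rho) sqrt(sigma) ||_1."""
    return trace_norm(psd_sqrt(rho) @ psd_sqrt(sigma))

def fidelity_distance(rho, sigma):
    """d_F(rho, sigma) = sqrt(1 - F(rho, sigma)^2)."""
    return np.sqrt(max(0.0, 1.0 - fidelity(rho, sigma) ** 2))

# Hypothetical example states (both positive semidefinite with trace 1).
rho = np.diag([0.9, 0.1])
sigma = np.array([[0.5, 0.4], [0.4, 0.5]])

T = trace_norm(rho - sigma)
F = fidelity(rho, sigma)
d = fidelity_distance(rho, sigma)
```

With these values one can verify directly that `1 - T/2 <= F <= sqrt(1 - T**2/4)` and `T/2 <= d <= sqrt(T)`.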

2.3.1 The diamond norm 

It will also be convenient on a few occasions to be able to compare two su- 
peroperators. To do this, we introduce the so-called diamond norm: 

Definition 2.3 (Diamond norm). Let $\mathcal{N} : L(\mathsf{A}) \to L(\mathsf{B})$ be any linear operator from
$L(\mathsf{A})$ to $L(\mathsf{B})$. Then, we define its diamond norm to be

$$\|\mathcal{N}\|_\diamond := \max_{\sigma^{AA'} \in D(\mathsf{A} \otimes \mathsf{A}')} \left\|(\mathcal{N}^{A \to B} \otimes \mathbb{I}^{A' \to A'})(\sigma^{AA'})\right\|_1$$

where the maximization is taken over all mixed states $\sigma^{AA'}$, and where $\mathsf{A}' \cong \mathsf{A}$.

This norm is usually called the completely bounded trace norm in operator the- 
ory and has been an object of study in that field for many years (see, for example, 
[Pau02] for an introduction to the area), but it was introduced to quantum infor- 
mation theory by Kitaev [Kit97] as the "diamond norm". 

The main reason for using the diamond norm to define a notion of distance 
on quantum channels is essentially the same as for using the trace norm on quan- 
tum states: it characterizes the optimal probability of successfully distinguishing 
two channels. Just as Theorem 2.1 shows that the optimal probability of
distinguishing the quantum states $\rho$ and $\sigma$ is $\frac{1}{2} + \frac{1}{4}\|\rho - \sigma\|_1$, it is possible to show
that the optimal probability of distinguishing the quantum channels $\mathcal{N}$ and $\mathcal{M}$
is given by $\frac{1}{2} + \frac{1}{4}\|\mathcal{N} - \mathcal{M}\|_\diamond$.



2.4 Information measures 

2.4.1 von Neumann entropy and derived quantities 

To be able to give solutions to information theory problems, we must have 
ways of measuring amounts of information. The fundamental quantity is the 
von Neumann entropy of a quantum state: 

Definition 2.4 (von Neumann entropy [vN32]). The von Neumann entropy of a quantum state $\rho^A$ is defined as $H(A)_\rho := -\operatorname{Tr}[\rho^A \log \rho^A]$ (where, if $\rho^A = \sum_i \mu_i |\psi_i\rangle\langle\psi_i|^A$ is a spectral decomposition of $\rho^A$, $\log \rho^A = \sum_i \log(\mu_i) |\psi_i\rangle\langle\psi_i|^A$, and we interpret $0 \log 0$ as $0$).

The von Neumann entropy measures the amount of information present in sequences of many copies of the same state, i.e. in $\rho^{\otimes n}$. More specifically, it has been shown by Schumacher that an i.i.d. state $\rho^{\otimes n}$ can be compressed into $n[H(A)_\rho + \delta]$ qubits with an error rate going to zero as $n \to \infty$ for any $\delta > 0$ [Sch95]. Hence, the higher the entropy, the less certain we are about the state and the more space we need to store it.
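The following sketch computes the von Neumann entropy from a list of eigenvalues, using the convention $0 \log 0 = 0$, together with the resulting qubit budget for compressing $n$ i.i.d. copies at a rate slightly above the entropy. Logarithms are taken in base 2; the helper names and the example spectra are assumptions chosen for illustration.

```python
import math

def von_neumann_entropy(eigenvalues):
    """Entropy in bits of a state with the given eigenvalues, with 0 log 0 := 0."""
    return -sum(mu * math.log2(mu) for mu in eigenvalues if mu > 0.0)

def compressed_qubits(n, eigenvalues, delta):
    """Qubit budget n * (H + delta) for n i.i.d. copies, rounded up."""
    return math.ceil(n * (von_neumann_entropy(eigenvalues) + delta))

h_pure = von_neumann_entropy([1.0, 0.0])     # a pure state has zero entropy
h_mixed = von_neumann_entropy([0.5, 0.5])    # a maximally mixed qubit has one bit
h_biased = von_neumann_entropy([0.9, 0.1])   # roughly 0.469 bits
budget = compressed_qubits(1000, [0.9, 0.1], 0.01)
print(h_pure, h_mixed, h_biased, budget)
```

For a general density matrix one would first diagonalize it; only the eigenvalues matter for the entropy.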

Many other information measures are derived from the von Neumann en- 
tropy. The first one is the conditional von Neumann entropy: 

Definition 2.5 (Conditional von Neumann entropy). Given a state $\rho^{AB}$, the conditional von Neumann entropy of $A$ given $B$ is defined as

$$H(A|B)_\rho := H(AB)_\rho - H(B)_\rho.$$

This is meant to describe the amount of uncertainty that we have about $A$ if we already possess $B$. This interpretation was problematic for a long time, given that the quantity can be negative (for instance, $H(A|A')_{\Phi^{AA'}} = -1$, where $|A| = |A'| = 2$ and $|\Phi\rangle^{AA'} = \frac{1}{\sqrt{2}} \sum_{i=1}^{2} |i\rangle^A |i\rangle^{A'}$). However, it turns out to give the solution to the following problem: given a state $(\rho^{AB})^{\otimes n}$ between Alice and Bob, how many EPR pairs are required between Alice and Bob to teleport Alice's $n$ shares to Bob (with free classical communication)? This task, called state merging [HOW07], is possible if we have $n[H(A|B)_\rho + \delta]$ EPR pairs, and the error goes to zero as $n \to \infty$ for every $\delta > 0$. When $H(A|B)_\rho$ is negative, we can teleport while generating EPR pairs at this rate.

Another information measure derived from the von Neumann entropy is the 
quantum mutual information: 

Definition 2.6 (Quantum mutual information). Let $\rho^{AB}$ be a quantum state. Then, the mutual information between $A$ and $B$ is defined as

$$\begin{aligned}
I(A;B)_\rho &:= H(A)_\rho + H(B)_\rho - H(AB)_\rho \\
&= H(A)_\rho - H(A|B)_\rho \\
&= H(B)_\rho - H(B|A)_\rho.
\end{aligned}$$
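A small worked example may help here. The sketch below evaluates $H(A|B)$ and $I(A;B)$ for pure two-qubit states, using two standard facts: a pure joint state has $H(AB) = 0$, and both of its marginals have the same spectrum. The amplitude-table representation `alpha[i][j]` and all function names are assumptions made for this illustration.

```python
import math

def entropy(eigenvalues):
    """Entropy in bits, with 0 log 0 := 0 (tiny eigenvalues are treated as zero)."""
    return -sum(x * math.log2(x) for x in eigenvalues if x > 1e-12)

def reduced_state_on_A(alpha):
    """rho_A = alpha alpha^dagger for |psi> = sum_ij alpha[i][j] |i>|j>."""
    dA, dB = len(alpha), len(alpha[0])
    return [[sum(alpha[i][j] * alpha[k][j].conjugate() for j in range(dB))
             for k in range(dA)] for i in range(dA)]

def eigenvalues_2x2(m):
    """Closed-form eigenvalues of a 2x2 Hermitian matrix."""
    trace = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    gap = math.sqrt(max(0.0, trace * trace - 4.0 * det))
    return [(trace + gap) / 2.0, (trace - gap) / 2.0]

def pure_state_quantities(alpha):
    """Return (H(A|B), I(A;B)) for the pure two-qubit state with amplitudes alpha."""
    h_a = entropy(eigenvalues_2x2(reduced_state_on_A(alpha)))
    h_ab = 0.0   # the joint state is pure
    h_b = h_a    # marginals of a pure state share the same nonzero spectrum
    return h_ab - h_b, h_a + h_b - h_ab

s = 1.0 / math.sqrt(2.0)
epr = [[s, 0.0], [0.0, s]]            # maximally entangled state
product = [[1.0, 0.0], [0.0, 0.0]]    # product state |0>|0>

cond_epr, mi_epr = pure_state_quantities(epr)
cond_prod, mi_prod = pure_state_quantities(product)
print(cond_epr, mi_epr, cond_prod, mi_prod)
```

The maximally entangled state gives a conditional entropy of $-1$ and a mutual information of $2$, whereas the product state gives zero for both.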



Without going into details, this quantity gives the entanglement-assisted classical capacity of memoryless quantum channels (channels of the form $\mathcal{N}^{\otimes n}$) [BSST02]. It is always nonnegative.

We can also define a conditional version of quantum mutual information: 

Definition 2.7 (Conditional quantum mutual information). Let $\rho^{ABC}$ be a quantum state. Then, the mutual information between $A$ and $B$ given $C$ is defined as

$$I(A;B|C)_\rho := I(A;BC)_\rho - I(A;C)_\rho.$$

This is also never negative [LR73], a fact that turns out to be highly nontrivial 
to prove. While we will only use this quantity in a technical proof in Chapter 4, 
it nonetheless has a natural interpretation through the task of state redistribution 
[DY06]. 

The coherent information is another measure that is important for channel 
coding problems. Unlike the previous one, this one has no classical analogue: 

Definition 2.8 (Coherent information). Let $\rho^{AB}$ be a quantum state. Then, the coherent information from $A$ to $B$ is defined as

$$I(A\rangle B)_\rho := -H(A|B)_\rho.$$

This is only positive when conditional entropy is negative, which only hap- 
pens when a state is entangled. This quantity gives the best known general rate 
for unassisted transmission of quantum data through i.i.d. quantum channels. 

2.4.2 Properties of the von Neumann entropy 

The family of entropic quantities defined above has a number of useful properties. In all of the statements below, let $\psi^{ABC}$ be any pure state with respect to which all entropic quantities are computed:

- $H(A) = H(BC)$

- $H(AB) = H(A) + H(B|A)$

- $H(A) = \frac{1}{2} I(A;B) + \frac{1}{2} I(A;C)$

- $I(A\rangle B) = \frac{1}{2} I(A;B) - \frac{1}{2} I(A;C)$

- $H(A|B) = -H(A|C)$

All of the above can be easily proven from the definitions. One can also show that, on a mixed state $\rho^{ABC}$, the following holds:

- $I(A;BC) \geq I(A;B)$.

In other words, the mutual information is monotonic under the addition of more subsystems; by taking into consideration an additional system $C$ in addition to $B$, one cannot lose information about $A$. This comes from the strong subadditivity of the von Neumann entropy [LR73] and its proof is rather involved compared to the previously stated properties.
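As an illustration of how the pure-state identities follow from the definitions, the third and fourth ones can be derived as sketched below. The only ingredient beyond the definitions is the first identity, which on a pure state $\psi^{ABC}$ gives $H(AB) = H(C)$ and $H(AC) = H(B)$; all entropies are computed on $\psi$.

```latex
\begin{align*}
I(A;B) + I(A;C) &= \bigl[H(A) + H(B) - H(AB)\bigr] + \bigl[H(A) + H(C) - H(AC)\bigr] \\
                &= 2H(A) + H(B) - H(C) + H(C) - H(B) \\
                &= 2H(A), \\[4pt]
I(A;B) - I(A;C) &= \bigl[H(B) - H(AB)\bigr] - \bigl[H(C) - H(AC)\bigr] \\
                &= \bigl[H(B) - H(C)\bigr] - \bigl[H(C) - H(B)\bigr] \\
                &= 2\bigl[H(B) - H(AB)\bigr] \\
                &= 2\, I(A\rangle B).
\end{align*}
```

Dividing each chain by two gives the stated identities.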

2.4.3 One-shot information measures 

All of the above quantities were relevant for tasks involving $n$ copies of a state, or $n$ uses of a quantum channel, and where we then take the limit as $n \to \infty$. However, in this thesis, we will generally start from protocols involving a single use of an arbitrary channel on a given arbitrary state, and then recover the i.i.d. setting as a special case by considering a "single use" of the channel $\mathcal{N}^{\otimes n}$. We will therefore need information measures that are relevant for the one-shot case and that reduce to the above quantities in the case of multiple copies.

The first one is the min-entropy of a quantum state: 

Definition 2.9 (Quantum min-entropy). Let $\rho^A$ be a quantum state. Then, its min-entropy is defined as

$$H_{\min}(A)_\rho := -\log \min\{\lambda : \rho^A \leq \lambda I^A\}.$$

In other words, the min-entropy is the negative logarithm of the largest 
eigenvalue. Classically, this definition goes back to Chor and Goldreich [CG88], 
and was generalized to quantum information by Renner [Ren05]. 
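Since the min-entropy only depends on the largest eigenvalue, it is straightforward to compute from a spectrum. The sketch below does so in base 2 and compares it with the von Neumann entropy of the same spectrum; the function names and example spectra are illustrative assumptions.

```python
import math

def min_entropy(eigenvalues):
    """H_min = -log2 of the largest eigenvalue."""
    return -math.log2(max(eigenvalues))

def von_neumann_entropy(eigenvalues):
    """Entropy in bits, with 0 log 0 := 0."""
    return -sum(mu * math.log2(mu) for mu in eigenvalues if mu > 0.0)

spectrum = [0.5, 0.25, 0.25]
h_min = min_entropy(spectrum)            # -log2(0.5) = 1 bit
h_vn = von_neumann_entropy(spectrum)     # 1.5 bits
uniform_h_min = min_entropy([0.25] * 4)  # 2 bits: both entropies agree on uniform spectra
print(h_min, h_vn, uniform_h_min)
```

The min-entropy never exceeds the von Neumann entropy, and the two coincide exactly when the spectrum is uniform on its support.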

Renner also defined a conditional version of the quantum min-entropy:

Definition 2.10 (Quantum conditional min-entropy). Let $\rho^{AB} \in \mathrm{Pos}(A \otimes B)$. Then, the conditional min-entropy of $A$ given $B$ is defined as

$$H_{\min}(A|B)_\rho := -\log \min\{\operatorname{Tr}[\sigma^B] : \sigma^B \in \mathrm{Pos}(B),\ \rho^{AB} \leq I^A \otimes \sigma^B\}.$$

This quantity measures how much uniform and private randomness we can 
extract from a random variable that is correlated with a quantum state that an 
attacker might possess, as shown in [Ren05]. It is also the quantity that governs 
how many bits of key must be used to encrypt the A part of a quantum state p AB 
against an adversary that knows B [DD]. 

While much more is known about the min-entropy, the following slightly more unwieldy quantity is used in many proofs:

Definition 2.11 (Quantum conditional 2-entropy). Let $\rho^{AB} \in \mathrm{Pos}(A \otimes B)$. Then, the conditional 2-entropy of $A$ given $B$ is defined as

$$H_2(A|B)_\rho := -\log \inf_{\sigma^B \in D(B)} \operatorname{Tr}\left[\left((\sigma^B \otimes I^A)^{-1/2} \rho^{AB}\right)^2\right].$$

Note that the conditional 2-entropy is always lower-bounded by the conditional min-entropy:

Lemma 2.3. Let $\rho^{AB} \in \mathrm{Pos}(A \otimes B)$; then $H_{\min}(A|B)_\rho \leq H_2(A|B)_\rho$.

Proof. Let $\lambda = 2^{-H_{\min}(A|B)_\rho}$, and let $\sigma^B$ be a normalized density operator such that $\rho^{AB} \leq \lambda I^A \otimes \sigma^B$; assume without loss of generality that $\sigma^B$ is positive definite (otherwise redefine $B$ as the support of $\rho^B$). Also, let $P^{AB} = I^{AB} - (\rho^{AB})^0$ (i.e. $P$ is a projector onto the kernel of $\rho^{AB}$). Then, using the fact that $X \geq Y \Rightarrow X^{-1/2} \leq Y^{-1/2}$ (which one can derive from Propositions V.1.6 and V.1.8 in [Bha96]), we have that $\lambda^{1/2} (\rho^{AB} + \varepsilon P)^{-1/2} \geq (I^A \otimes \sigma^B + \varepsilon P)^{-1/2}$, and therefore

$$\begin{aligned}
\operatorname{Tr}\left[\left((I^A \otimes \sigma^B + \varepsilon P)^{-1/2} \rho^{AB}\right)^2\right] &\leq \lambda \operatorname{Tr}\left[\left((\rho^{AB} + \varepsilon P)^{-1/2} \rho^{AB}\right)^2\right] \\
&= \lambda \operatorname{Tr}[\rho^{AB}] \\
&= \lambda.
\end{aligned}$$

Taking the limit as $\varepsilon \to 0$ yields the lemma. □

One can also define a "max-entropy" as done in [KRS09] in the following 
manner: 

Definition 2.12 (Quantum conditional max-entropy). Let $\psi^{ABC}$ be a pure state. Then, the conditional max-entropy of $A$ given $B$ is defined as

$$H_{\max}(A|B)_\psi := -H_{\min}(A|C)_\psi.$$

Since $H_{\min}(A|C)_\psi$ is invariant under unitaries on $C$, this does not depend on the particular choice of purification.

Note that there are at present two competing definitions of the max-entropy 
in circulation, at least in the non-conditional case. The other one is simply the 
logarithm of the rank of a state. However, the author feels that Definition 2.12 
is more compelling given the various results in [KRS09] and [TCR09], as well as 
the results in this thesis. 

In [KRS09], the authors give a nice direct interpretation of both the min- and the max-entropy: given a state $\rho^{AB}$, the conditional min-entropy $H_{\min}(A|B)_\rho$ quantifies how close to a maximally entangled state we can make $\rho^{AB}$ by applying an arbitrary CPTP map on $B$:

$$2^{-H_{\min}(A|B)_\rho} = |A| \max_{\mathcal{F}} F\left((\mathcal{I}^A \otimes \mathcal{F})(\rho^{AB}), \Phi^{AA'}\right)^2$$

where $\mathcal{F}$ ranges over all CPTP maps from $L(B)$ to $L(A')$, and $A'$ is a quantum system of the same dimension as $A$. Likewise, the max-entropy $H_{\max}(A|B)_\rho$ characterizes how close the state is to being decoupled and uniform on $A$:

$$2^{H_{\max}(A|B)_\rho} = |A| \max_{\sigma^B \in D(B)} F\left(\rho^{AB}, \frac{I^A}{|A|} \otimes \sigma^B\right)^2.$$

It can also be shown that $H_{\min}(A|B)_\rho \leq H(A|B)_\rho \leq H_{\max}(A|B)_\rho$ ([TCR08], Lemma 2).

One problem with all of the above quantities is that they are very sensitive 
to small variations in the state on which they are defined, whereas most of the 
quantities that we are bounding with them are not. Hence, if we use these quan- 
tities directly, we can end up with very poor bounds in certain cases. For this rea- 
son, we define "smooth" versions of these entropies. Instead of computing the 
entropic quantities directly on the state we are given, we optimize them over an 
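$\varepsilon$-ball around the state; this idea was introduced by Renner and Wolf in [RW04]. For any $\rho^{AB} \in D(A \otimes B)$, define

$$\mathcal{B}(\rho, \varepsilon) := \{\bar{\rho}^{AB} : \operatorname{Tr}[\bar{\rho}] \leq 1,\ d_F(\rho, \bar{\rho}) \leq \varepsilon\}.$$

We then define the following quantities:

Definition 2.13 (Smooth conditional min-entropy). Let $\rho^{AB}$ be a quantum state. Then, the $\varepsilon$-smooth conditional min-entropy of $A$ given $B$ is defined as

$$H_{\min}^{\varepsilon}(A|B)_\rho := \max_{\sigma \in \mathcal{B}(\rho, \varepsilon)} H_{\min}(A|B)_\sigma.$$

Definition 2.14 (Smooth conditional 2-entropy). Let $\rho^{AB}$ be a quantum state. Then, the $\varepsilon$-smooth conditional 2-entropy of $A$ given $B$ is defined as

$$H_2^{\varepsilon}(A|B)_\rho := \max_{\sigma \in \mathcal{B}(\rho, \varepsilon)} H_2(A|B)_\sigma.$$

Definition 2.15 (Smooth conditional max-entropy). Let $\rho^{AB}$ be a quantum state. Then, the $\varepsilon$-smooth conditional max-entropy of $A$ given $B$ is defined as

$$H_{\max}^{\varepsilon}(A|B)_\rho := \min_{\sigma \in \mathcal{B}(\rho, \varepsilon)} H_{\max}(A|B)_\sigma.$$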



As mentioned before, these quantities reduce to von Neumann quantities in the i.i.d. case. This is formalized in the following theorem, called the fully quantum asymptotic equipartition property, due to Tomamichel, Colbeck and Renner [TCR08]:

Theorem 2.4 (Fully Quantum Asymptotic Equipartition Property). Let $\rho^{AB}$ be a density operator, $\varepsilon > 0$, $\eta := 2^{-\frac{1}{2} H_{\min}(A|B)_\rho} + 2^{\frac{1}{2} H_{\max}(A|B)_\rho} + 1 \leq 2\sqrt{|A|} + 1$, and $n \in \mathbb{N}$. Then, if $n \geq \frac{8}{5} \log \frac{2}{\varepsilon^2}$,



2.5 Quantum channel capacities 

There are many variants of quantum channel capacities that can be defined, 
reflecting the large number of possible data transmission scenarios in which 
quantum channels can be useful. The two main ones are the classical capacity 
(the best rate at which we can send classical data through a quantum channel) 
and the quantum capacity (the best rate at which we can send arbitrary qubits 
through the channel). We can also define entanglement-assisted capacities of 
these two problems, in which the sender and the receiver share an arbitrary 
number of EPR pairs that they can use for free to help them transmit either clas- 
sical or quantum data through the quantum channel, as the case may be. We will 
not be concerned with the classical capacity of quantum channels in this thesis, 
however, and the entanglement-assisted classical capacity is simply twice the 
entanglement-assisted quantum capacity. We shall therefore only talk about the 
unassisted and entanglement-assisted quantum capacities. 

Furthermore, we can either consider what we can do with a single use of
an arbitrary channel (which is the most general version of the problem), or we
can restrict ourselves to i.i.d. channels (i.e. $n$ copies of a relatively small channel
$\mathcal{N}$). The i.i.d. case, in addition to being a practically relevant special case, is also
typically much easier to solve. We will consider both problems in this thesis: 
for all of the problems that we will consider, we will first prove a theorem for a 
single use of an arbitrary channel, and we will then apply it to an i.i.d. channel. 
The goal of this section is to define these problems and say a few words about 
them. 

2.5.1 One-shot capacities 

We first consider the simpler case of one-shot capacity. (Simpler to define, not to solve!) By the term "one-shot", we mean that we will consider protocols involving a single use of a channel, as opposed to using the same channel $n$ times. Suppose Alice would like to send an arbitrary quantum system $M$ to Bob using the channel $\mathcal{N}^{A' \to C}$ a single time. Alice will therefore encode her message $M$ into the channel input $A'$ using some encoding CPTP map $\mathcal{E}^{M \to A'}$. Upon receiving the channel output $C$, Bob will attempt to recover $M$ using a decoding CPTP map $\mathcal{D}^{C \to M}$. One would like to make sure that, regardless of the actual state of the message system $M$, Bob gets that same state at the output. When we consider every possible state of $M$, we must also include cases in which the contents of $M$ are entangled with another system. While $M$ can be entangled with an arbitrarily large system, it is mathematically equivalent to consider only entanglement with another system $R$ of dimension $|R| = |M|$.

Of course, since we are only using the channel once, one cannot hope in gen- 
eral to have no error whatsoever. We must therefore decide on an error level that 
we are willing to tolerate, and then look at how big a message we can transmit 
given this constraint. 

Taking all this into consideration, our goal is to find an encoder-decoder pair that satisfies

$$\left\| (\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})(\psi^{RM}) - \psi^{RM} \right\|_1 \leq \varepsilon$$

for every pure state $\psi^{RM}$. An alternative way of writing this is via the diamond norm on superoperators [Kit97]:

$$\left\| \mathcal{D} \circ \mathcal{N} \circ \mathcal{E} - \mathcal{I}^M \right\|_\diamond \leq \varepsilon. \qquad (2.1)$$

In other words, the composition of the encoder, channel, and decoder must be nearly indistinguishable from the identity channel.

This quantity, however, is rather difficult to bound directly because of the 
optimization over the input state. Fortunately, there exists an essentially equiv- 
alent criterion which is much easier to establish for a given protocol. Instead 
of considering the worst case input, one can consider only a fixed maximally 
entangled state $\Phi^{RM}$ between $R$ and $M$:

$$\left\| (\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})(\Phi^{RM}) - \Phi^{RM} \right\|_1 \leq \varepsilon. \qquad (2.2)$$

Requiring that an encoder and decoder fulfill this condition is weaker, but it 
turns out that by slightly reducing the dimension of the input system of the 
channel, one can turn an encoder-decoder pair that fulfills (2.2) into one that 
fulfills (2.1) [KW03]. 

Alice and Bob might also have EPR pairs at their disposal to help them increase the transmission rate; we call this the entanglement-assisted capacity. The setting is the same as above, except that Alice and Bob start out with an additional state $\Phi^{AB}$ that they are allowed to consume at will to help them in their task. We now have an encoder $\mathcal{E}^{MA \to A'}$ and a decoder $\mathcal{D}^{CB \to M}$, and we want to ensure that

$$\left\| (\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})(\Phi^{RM} \otimes \Phi^{AB}) - \Phi^{RM} \right\|_1 \leq \varepsilon.$$



Classical one-shot capacities were first considered by Han and Verdú [HV94], who pioneered the so-called information-spectrum approach to the capacity of general non-i.i.d. channels. Using an approach that is much closer to the one used in this thesis, Renner, Wolf and Wullschleger [RWW06] used classical versions of min- and max-entropies to derive bounds for the one-shot capacity of classical channels. On the quantum side, Buscemi and Datta [BD09] consider the one-shot capacity using different tools from the ones used here.

2.5.2 Capacities of memoryless channels 

We now wish to consider coding for channels of the form $\mathcal{N}^{\otimes n}$, where $n$ grows arbitrarily. In this case, we will want to find the best rate (number of qubits sent divided by number of channel uses) at which we can send quantum data such that the error rate goes to zero as $n \to \infty$. We call such channels memoryless channels, since the channel behaves exactly the same way from one use to the next without "remembering" previous inputs; we also sometimes call this the "i.i.d. case". The definitions in this case are slightly more involved due to the fact that we need a series of encoders and decoders that grows with $n$. We begin by defining the unassisted quantum capacity:

Definition 2.16 (Quantum code). An $(n, R)$-code for a quantum channel $\mathcal{N}^{A' \to C}$ is an encoding superoperator $\mathcal{E}^{M \to A'^n}$ and a decoding superoperator $\mathcal{D}^{C^n \to M}$, where $M$ is a $2^{nR}$-dimensional quantum system.

Definition 2.17 (Achievable rate). A rate $R$ is said to be achievable for a channel $\mathcal{N}^{A' \to C}$ if there exists a sequence of $(n, R)$-codes $(\mathcal{E}_n, \mathcal{D}_n)$ such that

$$\lim_{n \to \infty} \left\| \mathcal{D}_n \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_n - \mathcal{I}^M \right\|_\diamond = 0. \qquad (2.3)$$

Definition 2.18 (Quantum capacity). The quantum capacity $Q(\mathcal{N})$ of a quantum channel $\mathcal{N}$ is the supremum of all achievable rates for this channel.

Despite considerable efforts, we do not yet have a satisfactory characterization of the quantum capacity. We do have a general lower bound for the capacity [Llo96, Sho02, Dev05] which is given by the coherent information (see Theorem 3.15). This bound is known not to be tight, however [DSS98], and a very strange phenomenon appears: this capacity is not additive. More specifically, there exist pairs of quantum channels $\mathcal{N}$ and $\mathcal{M}$ such that the capacity of both $\mathcal{N}$ and $\mathcal{M}$ is zero, while the capacity of $\mathcal{N} \otimes \mathcal{M}$ is strictly positive [SY08].

We now turn to the definitions relevant for entanglement-assisted capacity: 

Definition 2.19 (Quantum entanglement-assisted code). An $(n, R, E)$-code for a quantum channel $\mathcal{N}^{A' \to C}$ is an encoding superoperator $\mathcal{E}^{MA \to A'^n}$ and an associated decoding superoperator $\mathcal{D}^{C^n B \to M}$, such that $|M| = 2^{nR}$ and $|A| = |B| = 2^{nE}$.

Definition 2.20 (Achievable rate). A rate $R$ is said to be achievable for a channel $\mathcal{N}^{A' \to C}$ if there exists a sequence of $(n, R, E)$-codes $(\mathcal{E}_n, \mathcal{D}_n)$ (for arbitrary finite $E$) such that

$$\lim_{n \to \infty} \left\| \mathcal{M}_n - \mathcal{I}^M \right\|_\diamond = 0 \qquad (2.4)$$

where $\mathcal{M}_n$ is the superoperator $\mathcal{M}_n(\rho) = (\mathcal{D}_n \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_n)(\Phi^{AB} \otimes \rho)$.

Definition 2.21 (Entanglement-assisted quantum capacity). The entanglement-assisted quantum capacity $Q_E(\mathcal{N})$ of a quantum channel $\mathcal{N}$ is the supremum of all achievable rates for this channel.

2.5.3 Regularization and single-letter converses 

When we set out to characterize the capacity of a type of memoryless chan- 
nel, we ultimately want to get an expression that can be efficiently computed 
from the description of the channel. Unfortunately, in quantum information 
theory, we seldom achieve this ideal. What usually happens is that we are able 
to give an easily computable achievable rate region, meaning a set of transmission 
rates that we know can be achieved, and we can often give an uncomputable 
expression for the true capacity. In some cases, such as for the unassisted trans- 
mission of quantum data through quantum channels, we know that there is a 
gap between the two expressions [DSS98]. The same is true for the transmission 
of classical data through quantum channels [Has09]. In other cases, such as the 
entanglement-assisted transmission through quantum multiple-access channels 
[HDW08], we do not know whether this is the case. Only in a few rare instances
can we show that the two coincide; the main example is the entanglement-assisted
quantum (and classical) capacities of quantum channels.

The achievable rate region and the uncomputable expression for the capacity 
usually take particular forms. For the sake of concreteness, we will consider the 
case of the transmission of quantum data through quantum channels, but the 
situation tends to be very similar in other settings. The best known achievable 
rate for this task is expressed in the following theorem ([Llo96, Sho02, Dev05], 
see also Theorem 3.15 for a proof): 

Theorem 2.5. Let $\mathcal{N}^{A' \to C}$ be a quantum channel, let $\sigma^{AA'}$ be any pure state with $A' \cong A$, and let $\rho^{AC} = \mathcal{N}^{A' \to C}(\sigma)$. Then, any rate $R < I(A\rangle C)_\rho$ is achievable for the transmission of quantum data through $\mathcal{N}$.

The main feature of this theorem is that it states the existence of protocols that send quantum data using the channel $n$ times, but whose rates can be computed by looking at a single instance of $\mathcal{N}$. Indeed, the state $\rho$ on which we compute $I(A\rangle C)_\rho$ is a state produced by a single application of $\mathcal{N}$. Furthermore, the proof of the theorem shows that these protocols can be constructed by choosing codes that "look" like the state $(\sigma^{A'})^{\otimes n}$ at the channel input. It therefore gives us some information about the structure of codes that achieve the rates advertised in the theorem statement.

The main question at this point is whether this theorem is optimal or not: is it possible to create codes that go beyond the highest rate this theorem can give? Since the above theorem holds for any channel, it is certainly possible to look at the rates we obtain for channels of the form $\mathcal{N}^{\otimes k}$ (i.e. if we regard $k$ uses of $\mathcal{N}$ as a single channel). If the above theorem were optimal, then doing this should never give us a higher rate than simply looking at $\mathcal{N}$ alone. In [DSS98], the authors show that it is in fact possible to get a higher rate this way, thereby showing Theorem 2.5 to be suboptimal.

This raises a further question: can we in fact get the optimal rate only by using the above theorem on some large number of copies of the same channel, or do we need to do something altogether different? The answer is that taking many copies is sufficient, and is expressed in the following theorem:

Theorem 2.6. Let $\mathcal{N}^{A' \to C}$ be a quantum channel. Then, the capacity of $\mathcal{N}$ is given by

$$Q(\mathcal{N}) = \sup \frac{1}{n} I(A\rangle C^n)_\rho \qquad (2.5)$$

where $\sigma^{AA'^n}$ ranges over all pure states, $A \cong (A')^{\otimes n}$, $\rho^{AC^n} = \mathcal{N}^{\otimes n}(\sigma)$ and $n$ ranges over all positive integers.

This is what we call a regularized converse or multiletter converse: it is a con- 
verse of Theorem 2.5, provided that we "regularize" it by considering many 
copies of the channel. This is not a very strong characterization of the capacity. 
One reason for this is that we cannot compute it: we have no bound on how 
large n has to be to get within a given factor of the capacity. Another perhaps 
even more depressing reason is the way we prove this last theorem: we look at 
an arbitrary code achieving quantum transmission, use it as the state $\sigma$ in the
above theorem, and show that the resulting $\frac{1}{n} I(A\rangle C^n)_\rho$ is lower-bounded by the
rate of the code. Since there exists a code for every rate below the capacity (by 
definition), the right-hand side of Equation (2.5) can never be lower than the ca- 
pacity. This makes the above theorem nearly tautological: if we choose the best 
possible code, then we reach the capacity. It says nothing whatsoever about the 
structure of capacity-achieving codes, which is perhaps the main motivation for 
studying channel capacity problems. 

As mentioned earlier, however, it is sometimes possible to prove that regu- 
larization is not necessary, and that a theorem that considers only one copy (like 
Theorem 2.5) is optimal. When this is the case, we say that we have a single-letter 
converse. The main example in quantum information theory is the entanglement- 
assisted quantum and classical capacities (see again Theorem 3.15). A further ex- 
ample is the entanglement-assisted quantum and classical capacities of quantum 
channels with side-information at the transmitter, which is studied in Chapter 4,
the single-letter converse being given as Theorem 4.5. When we have a single-letter
converse, it means that the code structure used in the proof of the corresponding
theorem is in fact the optimal way to code for this type of channel. We
can then say that we have a good grasp on how the channel carries information.

Finding expressions for the various capacities that are easily computable and 
that give us information about the structure of optimal codes is one of the main 
goals of information theory, and it has generally been rather difficult to achieve 
in the quantum setting. Finding such expressions for the most basic quantum 
capacities (such as the unassisted quantum and classical capacities of quantum 
channels) is one of the most central open problems in the field today. 

2.6 The duality between vectors and operators 

Periodically throughout this thesis it will be extremely useful to turn multipartite pure states into operators, and vice versa. This is simply a generalization of turning a "ket" into a "bra": if we have a vector $|\psi\rangle \in A$, then we can turn it into an operator $\langle\psi| \in L(A, \mathbb{C})$ from vectors to the complex numbers, by defining $\langle\psi|$ as the only operator in $L(A, \mathbb{C})$ such that $\langle\psi|\varphi\rangle = (|\psi\rangle, |\varphi\rangle)$, where $(\cdot, \cdot)$ denotes the inner product in $A$. We can turn multipartite states into more interesting operators, however. Endow $A$ and $B$ with standard orthonormal bases $\{|a_i\rangle^A\}$ and $\{|b_i\rangle^B\}$ respectively, and let $\mathrm{op}_{A \to B} : A \otimes B \to L(A, B)$ be defined as

$$\mathrm{op}_{A \to B}(|a_i\rangle|b_j\rangle) = |b_j\rangle\langle a_i| \quad \forall i, j.$$

This operation depends on the choice of standard basis; therefore, whenever it is used, a particular choice of basis is implied. Since this choice will never matter in this thesis, we shall not explicitly define these bases.

The following properties of the op transformation will be needed: 

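Lemma 2.7. Let $|\psi\rangle^{AB}$ and $|\varphi\rangle^{AC}$ be any vectors in $A \otimes B$ and $A \otimes C$ respectively. Then,

$$\mathrm{op}_{A \to B}(|\psi\rangle^{AB}) |\varphi\rangle^{AC} = \mathrm{op}_{A \to C}(|\varphi\rangle^{AC}) |\psi\rangle^{AB}.$$

Proof. Let $\{|a_i\rangle\}$, $\{|b_j\rangle\}$ and $\{|c_l\rangle\}$ be the canonical bases for $A$, $B$ and $C$ respectively, and let

$$|\psi\rangle^{AB} = \sum_{ij} \alpha_{ij} |a_i\rangle|b_j\rangle, \qquad |\varphi\rangle^{AC} = \sum_{ij} \beta_{ij} |a_i\rangle|c_j\rangle.$$

Then,

$$\begin{aligned}
\mathrm{op}_{A \to B}(|\psi\rangle^{AB}) |\varphi\rangle^{AC} &= \sum_{ijkl} \alpha_{ij} \beta_{kl} |b_j\rangle\langle a_i|a_k\rangle|c_l\rangle \\
&= \sum_{ijl} \alpha_{ij} \beta_{il} |b_j\rangle|c_l\rangle
\end{aligned}$$

and

$$\begin{aligned}
\mathrm{op}_{A \to C}(|\varphi\rangle^{AC}) |\psi\rangle^{AB} &= \sum_{ijkl} \alpha_{ij} \beta_{kl} |c_l\rangle\langle a_k|a_i\rangle|b_j\rangle \\
&= \sum_{ijl} \alpha_{ij} \beta_{il} |b_j\rangle|c_l\rangle.
\end{aligned}$$

□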
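The index manipulation in this proof can be checked numerically. In the sketch below, a vector in $A \otimes B$ is stored as a table `amplitudes[i][j]` of coefficients of $|a_i\rangle|b_j\rangle$, and operators are stored as nested lists with rows indexed by the output space. This representation, the helper names, and the random test vectors are assumptions made for illustration.

```python
import random

def op(amplitudes):
    """op_{A->B}: sum_ij a_ij |a_i>|b_j>  ->  sum_ij a_ij |b_j><a_i| (rows: B, columns: A)."""
    dA, dB = len(amplitudes), len(amplitudes[0])
    return [[amplitudes[i][j] for i in range(dA)] for j in range(dB)]

def apply_on_A(operator, state):
    """Apply an operator acting on A to the A factor of a state with amplitudes state[i][x]."""
    d_out, dA, dX = len(operator), len(state), len(state[0])
    return [[sum(operator[o][i] * state[i][x] for i in range(dA))
             for x in range(dX)] for o in range(d_out)]

def random_amplitudes(d_first, d_second, rng):
    """Random (unnormalized) complex amplitudes; the lemma holds for any vectors."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d_second)]
            for _ in range(d_first)]

rng = random.Random(0)
dA, dB, dC = 3, 2, 4
psi = random_amplitudes(dA, dB, rng)   # |psi> in A (x) B
phi = random_amplitudes(dA, dC, rng)   # |phi> in A (x) C

lhs = apply_on_A(op(psi), phi)   # amplitudes indexed [j][l], living on B (x) C
rhs = apply_on_A(op(phi), psi)   # amplitudes indexed [l][j], living on C (x) B

# Both sides describe the same vector once the tensor factors are ordered consistently
lemma_2_7_holds = all(abs(lhs[j][l] - rhs[l][j]) < 1e-12
                      for j in range(dB) for l in range(dC))
print(lemma_2_7_holds)
```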

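Lemma 2.8. Let $|\psi\rangle^{AB}$ be any vector in $A \otimes B$, let $A'$ be a system of equal dimension to $A$, and let $|\Phi\rangle^{AA'} = \frac{1}{\sqrt{|A|}} \sum_i |a_i\rangle|a'_i\rangle$, where the $|a_i\rangle$'s and $|a'_i\rangle$'s are the canonical bases of $A$ and $A'$ respectively. Then,

$$\sqrt{|A|}\, \mathrm{op}_{A \to B}(|\psi\rangle^{AB}) |\Phi\rangle^{AA'} = |\psi\rangle^{A'B}.$$

Proof. Let $|\psi\rangle^{AB} = \sum_{ij} \alpha_{ij} |a_i\rangle|b_j\rangle$; we then get that

$$\begin{aligned}
\sqrt{|A|}\, \mathrm{op}_{A \to B}(|\psi\rangle^{AB}) |\Phi\rangle^{AA'} &= \sum_{ijk} \alpha_{ij} |b_j\rangle\langle a_i|a_k\rangle|a'_k\rangle \\
&= \sum_{ij} \alpha_{ij} |a'_i\rangle^{A'} |b_j\rangle^{B} \\
&= |\psi\rangle^{A'B}.
\end{aligned}$$

□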



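Lemma 2.9. For any $|\psi\rangle^{AB} \in A \otimes B$ and any $M^{A \to C}$, we have that

$$\begin{aligned}
\mathrm{op}_{B \to C}(M|\psi\rangle) &= M\, \mathrm{op}_{B \to A}(|\psi\rangle) \\
\mathrm{op}_{C \to B}(M|\psi\rangle) &= \mathrm{op}_{A \to B}(|\psi\rangle)\, M^{T}
\end{aligned}$$

where the superscript $T$ denotes transposition.

Proof. Let $|\psi\rangle = \sum_{ij} \alpha_{ij} |a_i\rangle|b_j\rangle$ and $M = \sum_{kl} \gamma_{kl} |c_k\rangle\langle a_l|$. Then,

$$\begin{aligned}
\mathrm{op}_{B \to C}(M|\psi\rangle) &= \mathrm{op}_{B \to C}\left(\sum_{ijkl} \alpha_{ij} \gamma_{kl} |c_k\rangle\langle a_l|a_i\rangle|b_j\rangle\right) \\
&= \mathrm{op}_{B \to C}\left(\sum_{ijk} \alpha_{ij} \gamma_{ki} |c_k\rangle|b_j\rangle\right) \\
&= \sum_{ijk} \alpha_{ij} \gamma_{ki} |c_k\rangle\langle b_j|.
\end{aligned}$$

Likewise,

$$\begin{aligned}
M\, \mathrm{op}_{B \to A}(|\psi\rangle) &= \sum_{ijkl} \alpha_{ij} \gamma_{kl} |c_k\rangle\langle a_l|a_i\rangle\langle b_j| \\
&= \sum_{ijk} \alpha_{ij} \gamma_{ki} |c_k\rangle\langle b_j|.
\end{aligned}$$

The other statement is proven in the same manner:

$$\begin{aligned}
\mathrm{op}_{C \to B}(M|\psi\rangle) &= \sum_{ijkl} \alpha_{ij} \gamma_{kl}\, \mathrm{op}_{C \to B}\left(|c_k\rangle\langle a_l|a_i\rangle|b_j\rangle\right) \\
&= \sum_{ijk} \alpha_{ij} \gamma_{ki} |b_j\rangle\langle c_k|
\end{aligned}$$

and

$$\begin{aligned}
\mathrm{op}_{A \to B}(|\psi\rangle)\, M^{T} &= \sum_{ijkl} \alpha_{ij} \gamma_{kl} |b_j\rangle\langle a_i|a_l\rangle\langle c_k| \\
&= \sum_{ijk} \alpha_{ij} \gamma_{ki} |b_j\rangle\langle c_k|.
\end{aligned}$$

□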



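Lemma 2.10. Let $|\psi\rangle^{AB} \in A \otimes B$. Then, $\operatorname{Tr}_B[\psi^{AB}] = \mathrm{op}_{B \to A}(|\psi\rangle)\, \mathrm{op}_{B \to A}(|\psi\rangle)^\dagger$.

Proof. Let $|\psi\rangle = \sum_i \alpha_i |\psi_i\rangle^A |\varphi_i\rangle^B$ be the Schmidt decomposition of $|\psi\rangle$. Then,

$$\operatorname{Tr}_B[\psi^{AB}] = \sum_i \alpha_i^2 |\psi_i\rangle\langle\psi_i|$$

and

$$\begin{aligned}
\mathrm{op}_{B \to A}(|\psi\rangle)\, \mathrm{op}_{B \to A}(|\psi\rangle)^\dagger &= \sum_{ij} \alpha_i \alpha_j |\psi_i\rangle\langle\varphi_i|\varphi_j\rangle\langle\psi_j| \\
&= \sum_i \alpha_i^2 |\psi_i\rangle\langle\psi_i|.
\end{aligned}$$

□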
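This identity is easy to test numerically without going through the Schmidt decomposition. The sketch below builds the full density matrix of a random vector, traces out $B$ explicitly, and compares the result with $\mathrm{op}_{B \to A}(|\psi\rangle)\,\mathrm{op}_{B \to A}(|\psi\rangle)^\dagger$. The amplitude-table representation and helper names are assumptions made for illustration, and the vector is left unnormalized since the identity is linear in $|\psi\rangle\langle\psi|$.

```python
import random

def op_B_to_A(amplitudes):
    """op_{B->A}(|psi>) as a matrix with rows indexed by A and columns by B."""
    return [list(row) for row in amplitudes]

def times_adjoint(m):
    """Return the matrix product m m^dagger."""
    rows, cols = len(m), len(m[0])
    return [[sum(m[i][j] * m[k][j].conjugate() for j in range(cols))
             for k in range(rows)] for i in range(rows)]

def partial_trace_B(amplitudes):
    """Tr_B[|psi><psi|], computed from the entries of the full density matrix."""
    dA, dB = len(amplitudes), len(amplitudes[0])
    # Entry <a_i b_j|psi><psi|a_k b_l> of the full density matrix
    rho = {(i, j, k, l): amplitudes[i][j] * amplitudes[k][l].conjugate()
           for i in range(dA) for j in range(dB)
           for k in range(dA) for l in range(dB)}
    # Tracing out B keeps only the entries with j == l
    return [[sum(rho[(i, j, k, j)] for j in range(dB)) for k in range(dA)]
            for i in range(dA)]

rng = random.Random(2)
dA, dB = 3, 4
psi = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dB)]
       for _ in range(dA)]

lhs = partial_trace_B(psi)
rhs = times_adjoint(op_B_to_A(psi))
lemma_2_10_holds = all(abs(lhs[i][k] - rhs[i][k]) < 1e-12
                       for i in range(dA) for k in range(dA))
print(lemma_2_10_holds)
```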

We will also need to turn operators into vectors through the same process. For any pair of systems $A$ and $B$, define $\mathrm{vec} : L(A, B) \to A \otimes B$ as the transformation:

$$\mathrm{vec}(|b_j\rangle\langle a_i|) = |a_i\rangle|b_j\rangle.$$

It is simply the inverse of op.

We will need the following property of the vec transformation: 

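Lemma 2.11. Let $M^{A \to B}$ and $N^{A \to B}$ be arbitrary operators. Then, $\operatorname{Tr}[N^\dagger M] = \mathrm{vec}(N)^\dagger\, \mathrm{vec}(M)$.

Proof. Let $M = \sum_{ij} m_{ij} |b_j\rangle\langle a_i|$ and $N = \sum_{ij} n_{ij} |b_j\rangle\langle a_i|$. Then,

$$\operatorname{Tr}[N^\dagger M] = \sum_{ijkl} \bar{n}_{kl} m_{ij} \operatorname{Tr}\left[|a_k\rangle\langle b_l|b_j\rangle\langle a_i|\right] = \sum_{ij} \bar{n}_{ij} m_{ij}$$

and

$$\mathrm{vec}(N)^\dagger\, \mathrm{vec}(M) = \sum_{ijkl} \bar{n}_{kl} m_{ij} \langle a_k|a_i\rangle\langle b_l|b_j\rangle = \sum_{ij} \bar{n}_{ij} m_{ij}.$$

□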
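The following sketch checks Lemma 2.11 on random matrices and also confirms that vec undoes op. Operators in $L(A, B)$ are stored as nested lists with rows indexed by $B$ and columns by $A$, and vectors in $A \otimes B$ as amplitude tables; this representation and the helper names are assumptions chosen for illustration.

```python
import random

def vec(matrix):
    """vec: sum_ij m_ij |b_j><a_i|  ->  sum_ij m_ij |a_i>|b_j> (amplitudes indexed [i][j])."""
    dB, dA = len(matrix), len(matrix[0])
    return [[matrix[j][i] for j in range(dB)] for i in range(dA)]

def op(amplitudes):
    """Inverse map: amplitudes indexed [i][j] back to a matrix with rows B and columns A."""
    dA, dB = len(amplitudes), len(amplitudes[0])
    return [[amplitudes[i][j] for i in range(dA)] for j in range(dB)]

def trace_of_adjoint_product(n, m):
    """Tr[N^dagger M]: the diagonal entry i of N^dagger M is sum_j conj(N[j][i]) * M[j][i]."""
    dB, dA = len(m), len(m[0])
    return sum(sum(n[j][i].conjugate() * m[j][i] for j in range(dB))
               for i in range(dA))

def inner_product(u, v):
    """<u|v> for vectors stored as amplitude tables."""
    return sum(x.conjugate() * y
               for row_u, row_v in zip(u, v) for x, y in zip(row_u, row_v))

def random_matrix(rows, cols, rng):
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
            for _ in range(rows)]

rng = random.Random(1)
dA, dB = 3, 2
M = random_matrix(dB, dA, rng)
N = random_matrix(dB, dA, rng)

lemma_2_11_holds = abs(trace_of_adjoint_product(N, M)
                       - inner_product(vec(N), vec(M))) < 1e-9
round_trip_ok = op(vec(M)) == M
print(lemma_2_11_holds, round_trip_ok)
```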



CHAPTER 3 



THE DECOUPLING THEOREM 



One peculiar feature of quantum information theory is that some of the sim- 
plest coding theorems that we know come from theorems that tell us how to 
remove correlations, even though the goal of an error-correcting code is to es- 
tablish correlations between the sender and the receiver. The basic idea is the 
following: to prove a coding theorem, we generally need to assert the existence 
of a decoder of some sort; this decoder must be able to reproduce a particular 
state with good fidelity given only partial or noisy information. By purifying all 
systems, we can consider all subsystems that are not held by the decoder. These 
will generally include a subsystem purifying the state that the decoder needs to 
produce, as well as systems considered as part of the environment or that we 
otherwise don't care about. It turns out that, in such a case, a decoder exists 
if and only if the system purifying the desired state and the "environment" are 
close to a product state. The theorem that ensures this is called Uhlmann's theo- 
rem, and is the subject of the next section. Of course, for this approach to work, 
we need a way to ensure that two systems are close to a product state. Section 
3.2 will present a very general decoupling theorem with which we will prove all 
of the coding theorems in this thesis. 

Although some elements of this approach were already used earlier, this method came into its own with the discovery of the state merging protocol [HOW07], and later, the Fully Quantum Slepian-Wolf (FQSW) protocol [ADHW06]. A whole array of results can be easily derived from either of these protocols, including the "mother" and "father" protocols [DHW03], the quantum reverse Shannon theorem [BDH+06], the Lloyd-Shor-Devetak (LSD) theorem [Llo96, Sho02, Dev05], one-way entanglement distillation [DW05], and distributed compression [ADHW06]. This chapter will present a generalization of both FQSW and state merging that is much more flexible and which can therefore be used in a wider range of contexts.



3.1 Uhlmann's theorem 

Before starting, we will need to formally define what we mean by purification: 

Definition 3.1 (Purification). Let $\rho^A \in D(A)$ be any normalized density operator. Then, a purification of $\rho^A$ is any normalized vector $|\psi\rangle^{AB} \in A \otimes B$, with $B$ an arbitrary quantum system, such that $\operatorname{Tr}_B[|\psi\rangle\langle\psi|^{AB}] = \rho^A$. We then call $B$ the purifying system.

For any density operator, a purification exists, and is unique up to isometries 
on the purifying system. 

Uhlmann's theorem was first shown in [Uhl76]; the proof given here essen- 
tially follows the one in [Wat08]. 

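Theorem 3.1 (Uhlmann). Let $\rho^A$ and $\sigma^A$ be two quantum states, and let $|\psi\rangle^{AB}$ and $|\varphi\rangle^{AC}$ be purifications of $\rho^A$ and $\sigma^A$ respectively (the purifying systems $B$ and $C$ need not be isomorphic). Then,

$$F(\rho^A, \sigma^A) = \max_{V} |\langle\varphi|V|\psi\rangle| \qquad (3.1)$$

where the maximization is over all partial isometries from $B$ to $C$.

Proof. Let $U^{A \to B}$ and $W^{A \to C}$ be partial isometries such that $|\psi\rangle^{AB} = \mathrm{vec}(U\sqrt{\rho})$ and $|\varphi\rangle^{AC} = \mathrm{vec}(W\sqrt{\sigma})$. Then,

$$\begin{aligned}
F(\rho^A, \sigma^A) &= \left\|\sqrt{\rho}\sqrt{\sigma}\right\|_1 && (3.2) \\
&= \left\|U\sqrt{\rho}\sqrt{\sigma}W^\dagger\right\|_1 && (3.3) \\
&= \max_{V} \left|\operatorname{Tr}\left[V U\sqrt{\rho}\sqrt{\sigma}W^\dagger\right]\right| && (3.4) \\
&= \max_{V} \left|\mathrm{vec}(VU\sqrt{\rho})^\dagger\, \mathrm{vec}(W\sqrt{\sigma})\right| && (3.5) \\
&= \max_{V} |\langle\varphi|V|\psi\rangle| && (3.6)
\end{aligned}$$

where we have used Lemma 1.6 from the appendix on line (3.4) and Lemma 2.11 on line (3.5). □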

The main use of this theorem for coding purposes is that it often gives us a decoder "for free". Indeed, assume that, at the end of the execution of a channel coding protocol, we have a tripartite pure state $|\psi\rangle^{BER}$, with the three subsystems representing the shares of Bob, the environment, and a "reference" system which purifies the qubits that Alice wanted to send to Bob. Now, suppose that we were able to show that the environment is nearly uncorrelated with the reference: $F(\psi^{RE}, \rho^R \otimes \sigma^E) \geq 1 - \varepsilon$. Then, given a product purification $|\varphi\rangle^{R\tilde{B}} \otimes |\theta\rangle^{E\hat{B}}$ of $\rho^R \otimes \sigma^E$, there exists a partial isometry $V^{B \to \tilde{B}\hat{B}}$ such that $F(V|\psi\rangle^{BER}, |\varphi\rangle^{R\tilde{B}} \otimes |\theta\rangle^{E\hat{B}}) \geq 1 - \varepsilon$.

Since we generally use the trace distance rather than the fidelity, the follow- 
ing corollary of Uhlmann's theorem (Lemma 2.2 in [DHW05]) will be very useful 
to us: 

Corollary 3.2. Let $|\psi\rangle^{AB}$ and $|\varphi\rangle^{AC}$ be two quantum states such that $\|\psi^A - \varphi^A\|_1 \le \varepsilon$. 
Then there exists an isometry $U^{B \to C}$ such that $\|(U^{B \to C} \cdot \psi^{AB}) - \varphi^{AC}\|_1 \le 2\sqrt{\varepsilon}$. 

Proof. If $\|\psi^A - \varphi^A\|_1 \le \varepsilon$, then by the Fuchs-van de Graaf inequalities [FvdG99] 
(Lemma 2.2) we have that $F(\psi^A, \varphi^A) \ge 1 - \frac{1}{2}\varepsilon$. By Uhlmann's theorem, this 
means that there exists a partial isometry $U^{B \to C}$ such that $F(U^{B \to C} \cdot \psi^{AB}, \varphi^{AC}) \ge 
1 - \frac{1}{2}\varepsilon$. A second application of the Fuchs-van de Graaf inequalities concludes 
the proof. □ 
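
The two ingredients of this proof can be checked numerically. The following is a minimal sketch, assuming the fidelity convention $F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1$ of (3.2) and the Fuchs-van de Graaf inequalities in the form $1 - F \le \frac{1}{2}\|\rho - \sigma\|_1 \le \sqrt{1 - F^2}$. The helper names are illustrative and not part of the text.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Sample a full-rank density matrix from a complex Ginibre matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def psd_sqrt(rho):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(rho)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_norm(m):
    """Trace norm, i.e. the sum of the singular values."""
    return np.linalg.svd(m, compute_uv=False).sum()

def fidelity(rho, sigma):
    """F(rho, sigma) = || sqrt(rho) sqrt(sigma) ||_1, as in (3.2)."""
    return trace_norm(psd_sqrt(rho) @ psd_sqrt(sigma))

def fuchs_van_de_graaf_holds(rho, sigma, tol=1e-9):
    """Check 1 - F <= (1/2)||rho - sigma||_1 <= sqrt(1 - F^2)."""
    f = fidelity(rho, sigma)
    half_dist = 0.5 * trace_norm(rho - sigma)
    upper = np.sqrt(max(0.0, 1.0 - f ** 2))
    return bool(1.0 - f <= half_dist + tol and half_dist <= upper + tol)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    rho = random_density_matrix(4, rng)
    sigma = random_density_matrix(4, rng)
    print(fidelity(rho, sigma), fuchs_van_de_graaf_holds(rho, sigma))
```

Applying the inequalities twice, as in the proof, turns a trace distance of $\varepsilon$ into a fidelity of at least $1 - \varepsilon/2$ and then back into a trace distance of at most $2\sqrt{\varepsilon}$.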

3.2 The decoupling theorem 

To be able to use Uhlmann's theorem to derive a coding scheme, we need 
a way to ensure that two quantum systems are nearly uncorrelated. The main 
theorem of this section will achieve this for us. 

Suppose Alice holds the $A$ share of a mixed state $\rho^{AR}$. We would like to perform 
an operation on Alice's system to ensure that her share is decoupled from 
the reference. We will consider a very general operation: a fixed unitary 
transformation followed by an arbitrary completely positive superoperator $\mathcal{T}^{A \to E}$. 
We will show that if we choose the unitary transformation randomly according 
to the Haar measure (which can essentially be viewed as the uniform distribution 
over all unitaries), then the resulting protocol will on average perform well. 
This generalizes all of the decoupling theorems in the literature that the author 
is aware of, including the Fully Quantum Slepian-Wolf theorem [ADHW06], 
which corresponds to the special case in which $\mathcal{T}$ traces out part of the system, 
as well as the state merging theorem [HOW07], in which $\mathcal{T}^{A \to EX}$ corresponds to 
making a rank-$|E|$ measurement and then storing the measurement result in the 
classical register $X$ and the residual quantum state in $E$. One advantage of this 
generalization is that it allows us to choose $\mathcal{T}$ to be a very complex operation; 
one especially interesting example is to pick $\mathcal{T}$ to be the complementary channel 
(the channel to the environment) of a channel we are interested in coding for. 
Another advantage is the use of (smooth) conditional 2-entropies rather than 
purities and dimension bounds as was done in all of these theorems (although, 
in the case of state merging, this was already done in [Ber08] and [BCR09], and, 
in the case of FQSW, by Hayden in [Hay06]). This theorem allows us to show 
directly that the environment is decoupled from any system of interest, which is 
usually what we need to show. 

We will calculate how close the remaining state on ER is to a product state 
in the main theorem of this section (Theorem 3.7). To get to it, however, we will 
first need the following four technical lemmas. The first one is simply a trick 
that we will use to compute the trace of the square of a matrix: 

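Lemma 3.3 (Swap trick). Given two operators $M \in L(A)$ and $N \in L(A)$, then 
$\mathrm{Tr}[MN] = \mathrm{Tr}[(M \otimes N)F]$, where $F$ swaps the two copies of the $A$ subsystem. 

Proof. Write $M$ and $N$ in the standard basis for $A$: $M = \sum_{ij} m_{ij}|i\rangle\langle j|$ and $N = 
\sum_{kl} n_{kl}|k\rangle\langle l|$. Then, 

\begin{align}
\mathrm{Tr}[(M \otimes N)F] &= \mathrm{Tr}\Big[\sum_{ijkl} m_{ij}n_{kl}\,(|i\rangle\langle j| \otimes |k\rangle\langle l|)\,F\Big] \tag{3.7} \\
&= \mathrm{Tr}\Big[\sum_{ijkl} m_{ij}n_{kl}\,|i\rangle\langle l| \otimes |k\rangle\langle j|\Big] \tag{3.8} \\
&= \sum_{ij} m_{ij}n_{ji} \tag{3.9} \\
&= \mathrm{Tr}[MN]. \tag{3.10}
\end{align}

□ 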
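
As a sanity check, the swap trick can be verified numerically on random matrices. This is a minimal sketch; the function names are illustrative, and the swap operator is built in the standard basis under the convention $F|a\rangle|b\rangle = |b\rangle|a\rangle$.

```python
import numpy as np

def swap_operator(d):
    """Matrix of F on C^d (x) C^d, defined by F|a>|b> = |b>|a>."""
    f = np.zeros((d * d, d * d))
    for a in range(d):
        for b in range(d):
            # Column a*d+b is the basis vector |a,b>; it is mapped to |b,a>.
            f[b * d + a, a * d + b] = 1.0
    return f

def swap_trick_gap(m, n):
    """Return |Tr[MN] - Tr[(M (x) N) F]|, which Lemma 3.3 says is zero."""
    d = m.shape[0]
    lhs = np.trace(m @ n)
    rhs = np.trace(np.kron(m, n) @ swap_operator(d))
    return abs(lhs - rhs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    n = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    print(swap_trick_gap(m, n))
```

Taking $M = N$ gives $\mathrm{Tr}[M^2] = \mathrm{Tr}[M^{\otimes 2}F]$, which is how the lemma is used to compute the trace of the square of a matrix.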



The second lemma involves averaging over Haar-distributed unitaries. 
While it would take us too far afield to formally introduce the Haar measure, 
it can simply be thought of as the uniform probability distribution over the set 
of all unitaries on a Hilbert space. The following then tells us the expected value 
of $U^{\otimes 2} \cdot M$ (with $M \in L(A^{\otimes 2})$) when $U$ is selected "uniformly at random": 

Lemma 3.4. Given an operator $M \in L(A^{\otimes 2})$, we have that 

\[ \int_{\mathbb{U}(A)} U^{\otimes 2} \cdot M \, dU = \alpha I^{AA'} + \beta F^{A} \tag{3.11} \]

where $\alpha$ and $\beta$ are such that $\mathrm{Tr}[M] = \alpha|A|^2 + \beta|A|$ and $\mathrm{Tr}[MF] = \alpha|A| + \beta|A|^2$, and 
where $dU$ is the normalized Haar measure on $\mathbb{U}(A)$. 

Proof. This is a standard result in Schur-Weyl duality. This is a special case of, for 
instance, Proposition 2.2 in [CS06]. To see this, note that Proposition 2.2 states 
that $\mathbb{E} : L(A^{\otimes 2}) \to L(A^{\otimes 2})$ is an orthogonal projection onto $\mathrm{span}\{I, F\}$ under the 
inner product $\langle A, B \rangle = \mathrm{Tr}[A^\dagger B]$. Hence, $\mathbb{E}(M)$ can be written as $\alpha I^{AA'} + \beta F^{A}$ as 
claimed, and the conditions $\mathrm{Tr}[I\,\mathbb{E}(M)] = \mathrm{Tr}[M]$ and $\mathrm{Tr}[F\,\mathbb{E}(M)] = \mathrm{Tr}[FM]$ must 
be fulfilled, and these lead to the two conditions on $\alpha$ and $\beta$. □ 

The following bounds the ratio of the purity of a bipartite state and the purity 
of the reduced state on one subsystem: 

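Lemma 3.5. Let $\xi^{AB} \in \mathrm{Pos}(A \otimes B)$ be any positive semidefinite operator. Then 

\[ \frac{1}{|A|} \le \frac{\mathrm{Tr}[(\xi^{AB})^2]}{\mathrm{Tr}[(\xi^{B})^2]} \le |A|. \tag{3.12} \]

Proof. Letting $A'$ be a system isomorphic to $A$, we first prove the left-hand side: 

\begin{align*}
\mathrm{Tr}[(\xi^{B})^2] &= \mathrm{Tr}\big[(\mathrm{Tr}_A[\xi^{AB}])^2\big] \\
&= \mathrm{Tr}\big[\mathrm{Tr}_A[\xi^{AB}]\,\mathrm{Tr}_{A'}[\xi^{A'B}]\big] \\
&= \mathrm{Tr}\big[(\xi^{AB} \otimes I^{A'})(\xi^{A'B} \otimes I^{A})\big] \\
&\le \sqrt{\mathrm{Tr}\big[(\xi^{AB} \otimes I^{A'})^2\big]\,\mathrm{Tr}\big[(\xi^{A'B} \otimes I^{A})^2\big]} \\
&= \sqrt{|A|\,\mathrm{Tr}[(\xi^{AB})^2] \cdot |A|\,\mathrm{Tr}[(\xi^{A'B})^2]} \\
&= |A|\,\mathrm{Tr}[(\xi^{AB})^2]
\end{align*}
(3.13)-(3.19) 

where the inequality is due to an application of Cauchy-Schwarz. The right-hand 
side follows from the fact that $\xi^{AB} \le |A|\, I^A \otimes \xi^B$. This can in turn be seen 
from the fact that $|A|\, I^A \otimes \xi^B = \sum_{i=1}^{|A|^2} U_i^A \cdot \xi^{AB}$, where the $U_i$'s are Weyl operators 
with $U_1 = I$. □ 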
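
Both bounds in (3.12) are attained, which a short numerical experiment makes visible. The sketch below assumes the ordering $A \otimes B$ for the tensor factors; the function names are illustrative. A product state $\pi^A \otimes |0\rangle\langle 0|^B$ gives the ratio $1/|A|$, and a maximally entangled state gives $|A|$.

```python
import numpy as np

def partial_trace_first(xi, d_a, d_b):
    """Trace out the first tensor factor A of an operator on A (x) B."""
    return np.einsum('abac->bc', xi.reshape(d_a, d_b, d_a, d_b))

def purity_ratio(xi, d_a, d_b):
    """Tr[(xi^AB)^2] / Tr[(xi^B)^2], the quantity bounded in (3.12)."""
    xi_b = partial_trace_first(xi, d_a, d_b)
    return np.trace(xi @ xi).real / np.trace(xi_b @ xi_b).real

def product_example(d_a, d_b):
    """Completely mixed state on A tensored with |0><0| on B."""
    zero_b = np.zeros((d_b, d_b))
    zero_b[0, 0] = 1.0
    return np.kron(np.eye(d_a) / d_a, zero_b)

def maximally_entangled(d):
    """Density matrix of d^{-1/2} sum_i |i>|i> on two d-dimensional systems."""
    v = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(v, v)

if __name__ == "__main__":
    print(purity_ratio(product_example(3, 2), 3, 2))      # lower bound 1/|A|
    print(purity_ratio(maximally_entangled(3), 3, 3))     # upper bound |A|
```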

In the main proof, we will need to bound the trace distance between two 
states using the 2-norm. The following lemma will allow us to do this: 

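Lemma 3.6. Let $M \in L(A)$ be any operator and let $\sigma \in \mathrm{Pos}(A)$ be a positive definite 
operator. Then, 

\[ \|M\|_1 \le \sqrt{\mathrm{Tr}[\sigma]\,\mathrm{Tr}\big[\sigma^{-1/4} M \sigma^{-1/2} M^\dagger \sigma^{-1/4}\big]}. \tag{3.20} \]

In particular, if $M$ is Hermitian, then 

\[ \|M\|_1 \le \sqrt{\mathrm{Tr}[\sigma]\,\mathrm{Tr}\big[(\sigma^{-1/4} M \sigma^{-1/4})^2\big]}. \tag{3.21} \]

This is a slight generalization of Lemma 5.1.3 in [Ren05]; we give a different 
proof here for completeness: 

Proof. 

\begin{align}
\|M\|_1 &= \max_U |\mathrm{Tr}[UM]| \tag{3.22} \\
&= \max_U \big|\mathrm{Tr}\big[(\sigma^{1/4} U \sigma^{1/4})(\sigma^{-1/4} M \sigma^{-1/4})\big]\big| \tag{3.23} \\
&\le \max_U \sqrt{\mathrm{Tr}\big[(\sigma^{1/4} U \sigma^{1/4})(\sigma^{1/4} U^\dagger \sigma^{1/4})\big]\,\mathrm{Tr}\big[\sigma^{-1/4} M \sigma^{-1/2} M^\dagger \sigma^{-1/4}\big]} \tag{3.24} \\
&= \max_U \sqrt{\mathrm{Tr}\big[\sigma^{1/2} U \sigma^{1/2} U^\dagger\big]\,\mathrm{Tr}\big[\sigma^{-1/4} M \sigma^{-1/2} M^\dagger \sigma^{-1/4}\big]} \tag{3.25} \\
&= \sqrt{\mathrm{Tr}[\sigma]\,\mathrm{Tr}\big[\sigma^{-1/4} M \sigma^{-1/2} M^\dagger \sigma^{-1/4}\big]} \tag{3.26}
\end{align}

where the first equality is an application of Lemma 1.6 and the inequality results 
from an application of Cauchy-Schwarz, and the maximizations are over 
all unitaries on $A$. The last equality follows from 

\begin{align*}
\max_U \mathrm{Tr}\big[\sigma^{1/2} U \sigma^{1/2} U^\dagger\big] &\le \max_U \sqrt{\mathrm{Tr}[\sigma]\,\mathrm{Tr}\big[U \sigma^{1/2} U^\dagger U \sigma^{1/2} U^\dagger\big]} \\
&= \mathrm{Tr}[\sigma] \\
&\le \max_U \mathrm{Tr}\big[\sigma^{1/2} U \sigma^{1/2} U^\dagger\big].
\end{align*}

□ 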
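
The bound (3.20) is easy to test numerically. The following sketch evaluates both sides for a given $M$ and a positive definite $\sigma$; the helper names are illustrative. Choosing $M = \sigma$ makes the right-hand side equal to $\mathrm{Tr}[\sigma] = \|\sigma\|_1$, so the inequality is tight in that case.

```python
import numpy as np

def trace_norm(m):
    """Trace norm, i.e. the sum of the singular values."""
    return np.linalg.svd(m, compute_uv=False).sum()

def psd_power(sigma, p):
    """Real power of a positive definite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(sigma)
    return (vecs * vals ** p) @ vecs.conj().T

def lemma_3_6_sides(m, sigma):
    """Return (||M||_1, right-hand side of (3.20))."""
    s4 = psd_power(sigma, -0.25)
    s2 = psd_power(sigma, -0.5)
    inner = np.trace(s4 @ m @ s2 @ m.conj().T @ s4).real
    return trace_norm(m), np.sqrt(np.trace(sigma).real * inner)

def random_positive_definite(dim, rng):
    """Random positive definite matrix, shifted away from singularity."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T + 0.1 * np.eye(dim)

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    sigma = random_positive_definite(4, rng)
    m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    print(lemma_3_6_sides(m, sigma))
```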

We are now ready to prove the main theorem: 

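Theorem 3.7. Let $\rho^{AR}$ be a density operator, $\mathcal{T}^{A \to E}$ be any completely positive 
superoperator, and define $\omega^{A'E} := (\mathcal{T} \otimes \mathcal{I}^{A'})(\Phi^{AA'})$. Then, 

\[ \int_{\mathbb{U}(A)} \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \, dU \le 2^{-\frac{1}{2}H_2(A'|E)_\omega - \frac{1}{2}H_2(A|R)_\rho} \tag{3.27} \]

where $\int \cdot \, dU$ denotes the integral over the Haar measure over unitaries $U^A$ acting on $A$. 

Proof. Throughout the proof, we will denote with a prime the "twin" subsystems 
used when we take tensor copies of operators, and $F^S$ denotes a swap 
between $S$ and $S'$. 

We first use Lemma 3.6. Letting $\sigma^E$ and $\zeta^R$ be any normalized, positive definite 
density matrices on $E$ and $R$ respectively, we get: 

\[ \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \le \sqrt{\mathrm{Tr}\Big[\big((\sigma^E \otimes \zeta^R)^{-1/4}(\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R)(\sigma^E \otimes \zeta^R)^{-1/4}\big)^2\Big]}. \tag{3.28} \]

Define the CP map $\tilde{\mathcal{T}}^{A \to E}$ as $\tilde{\mathcal{T}}(\xi) = (\sigma^E)^{-1/4}\,\mathcal{T}(\xi)\,(\sigma^E)^{-1/4}$, the state $\tilde{\rho}^{AR}$ as 
$\tilde{\rho}^{AR} = (\zeta^R)^{-1/4} \rho^{AR} (\zeta^R)^{-1/4}$, and the state $\tilde{\omega}^{A'E}$ as $\tilde{\omega}^{A'E} = \tilde{\mathcal{T}}(\Phi^{A'A})$. We then rewrite the 
above as 

\[ \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \le \sqrt{\mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR}) - \tilde{\omega}^E \otimes \tilde{\rho}^R\big)^2\Big]}. \tag{3.29} \]

Using Jensen's inequality, we can get 

\[ \int \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \, dU \le \sqrt{\int \mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR}) - \tilde{\omega}^E \otimes \tilde{\rho}^R\big)^2\Big] dU}. \tag{3.30} \]

We now simplify the integral: 

\begin{align}
\int \mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR}) - \tilde{\omega}^E \otimes \tilde{\rho}^R\big)^2\Big] dU
&= \int \mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR})\big)^2\Big] dU - 2\int \mathrm{Tr}\Big[\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR})\,(\tilde{\omega}^E \otimes \tilde{\rho}^R)\Big] dU + \mathrm{Tr}\Big[(\tilde{\omega}^E \otimes \tilde{\rho}^R)^2\Big] \notag \\
&= \int \mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR})\big)^2\Big] dU - 2\,\mathrm{Tr}\Big[\tilde{\mathcal{T}}\Big(\int U \cdot \tilde{\rho}^{AR}\, dU\Big)(\tilde{\omega}^E \otimes \tilde{\rho}^R)\Big] + \mathrm{Tr}\Big[(\tilde{\omega}^E \otimes \tilde{\rho}^R)^2\Big] \notag \\
&= \int \mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR})\big)^2\Big] dU - \mathrm{Tr}[(\tilde{\omega}^E)^2]\,\mathrm{Tr}[(\tilde{\rho}^R)^2], \tag{3.31}
\end{align}

where the last equality uses $\int U \cdot \tilde{\rho}^{AR}\, dU = \pi^A \otimes \tilde{\rho}^R$ and $\tilde{\mathcal{T}}(\pi^A) = \tilde{\omega}^E$. 

We attack the first term as follows: 

\begin{align}
\int \mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR})\big)^2\Big] dU
&= \int \mathrm{Tr}\Big[\big(\tilde{\mathcal{T}}(U \cdot \tilde{\rho}^{AR})\big)^{\otimes 2} F^{ER}\Big] dU \notag \\
&= \int \mathrm{Tr}\Big[\tilde{\mathcal{T}}^{\otimes 2}\big(U^{\otimes 2} \cdot (\tilde{\rho}^{AR})^{\otimes 2}\big) F^{ER}\Big] dU \notag \\
&= \int \mathrm{Tr}\Big[(\tilde{\rho}^{AR})^{\otimes 2}\,\Big(\big((U^\dagger)^{\otimes 2} \cdot (\tilde{\mathcal{T}}^\dagger)^{\otimes 2}(F^E)\big) \otimes F^R\Big)\Big] dU \notag \\
&= \mathrm{Tr}\Big[(\tilde{\rho}^{AR})^{\otimes 2}\,\Big(\int (U^\dagger)^{\otimes 2} \cdot (\tilde{\mathcal{T}}^\dagger)^{\otimes 2}(F^E)\, dU \otimes F^R\Big)\Big] \tag{3.32}
\end{align}

where we have used Lemma 3.3 in the first equality, and the definition of the 
adjoint of a superoperator in the third equality. We now compute the integral 
using Lemma 3.4: 

\[ \int (U^\dagger)^{\otimes 2} \cdot (\tilde{\mathcal{T}}^\dagger)^{\otimes 2}(F^E)\, dU = \alpha I^{AA'} + \beta F^A \tag{3.33} \]
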

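where $\alpha$ and $\beta$ satisfy the following equations: 

\begin{align}
\alpha|A|^2 + \beta|A| &= \mathrm{Tr}\big[(\tilde{\mathcal{T}}^\dagger)^{\otimes 2}(F^E)\big] \tag{3.34} \\
&= \mathrm{Tr}\big[F^E\, \tilde{\mathcal{T}}^{\otimes 2}(I^{AA'})\big] \tag{3.35} \\
&= |A|^2\, \mathrm{Tr}\big[F^E (\tilde{\omega}^E)^{\otimes 2}\big] \tag{3.36} \\
&= |A|^2\, \mathrm{Tr}\big[(\tilde{\omega}^E)^2\big] \tag{3.37}
\end{align}

and 

\begin{align}
\alpha|A| + \beta|A|^2 &= \mathrm{Tr}\big[(\tilde{\mathcal{T}}^\dagger)^{\otimes 2}(F^E)\, F^A\big] \tag{3.38} \\
&= \mathrm{Tr}\big[F^E\, \tilde{\mathcal{T}}^{\otimes 2}(F^A)\big] \tag{3.39} \\
&= |A|^2\, \mathrm{Tr}\big[F^E\, \mathrm{Tr}_{AA'}\big[(\tilde{\omega}^{AE})^{\otimes 2} F^A\big]\big] \tag{3.40} \\
&= |A|^2\, \mathrm{Tr}\big[(I^{AA'} \otimes F^E)(\tilde{\omega}^{AE})^{\otimes 2}(F^A \otimes I^{EE'})\big] \tag{3.41} \\
&= |A|^2\, \mathrm{Tr}\big[F^{AE} (\tilde{\omega}^{AE})^{\otimes 2}\big] \tag{3.42} \\
&= |A|^2\, \mathrm{Tr}\big[(\tilde{\omega}^{A'E})^2\big] \tag{3.43}
\end{align}

where $\tilde{\omega}^{AE}$ is simply $\tilde{\omega}^{A'E}$ with $A$ and $A'$ reversed. In the third equality, we have 
used the fact that $|A|\,\tilde{\omega}^{AE}$ is a Choi-Jamiołkowski [Cho75, Jam72] representation 
of $\tilde{\mathcal{T}}$; the fourth equality is due to the fact that the adjoint of the partial trace is 
tensoring with the identity. 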

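Solving this system of equations yields 

\begin{align}
\alpha &= \mathrm{Tr}\big[(\tilde{\omega}^E)^2\big] \left( \frac{|A|^2 - |A|\,\mathrm{Tr}[(\tilde{\omega}^{A'E})^2] / \mathrm{Tr}[(\tilde{\omega}^E)^2]}{|A|^2 - 1} \right) \tag{3.44} \\
\beta &= \mathrm{Tr}\big[(\tilde{\omega}^{A'E})^2\big] \left( \frac{|A|^2 - |A|\,\mathrm{Tr}[(\tilde{\omega}^E)^2] / \mathrm{Tr}[(\tilde{\omega}^{A'E})^2]}{|A|^2 - 1} \right). \tag{3.45}
\end{align}

By applying Lemma 3.5, we can simplify this to $\alpha \le \mathrm{Tr}[(\tilde{\omega}^E)^2]$ and $\beta \le 
\mathrm{Tr}[(\tilde{\omega}^{A'E})^2]$. Substituting this into (3.32) and using Lemma 3.3 twice, and then 
substituting into (3.30) yields 

\[ \int \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \, dU \le \sqrt{\mathrm{Tr}\big[(\tilde{\omega}^{A'E})^2\big]\,\mathrm{Tr}\big[(\tilde{\rho}^{AR})^2\big]}. \tag{3.46} \]

We then get the theorem by using the definitions of $\tilde{\omega}$, $\tilde{\rho}$ and the definition of 
$H_2$. □ 

We now prove a version of the theorem that allows us to replace the $H_2$ in the 
upper bound by the smoothed versions of $H_2$. Among other things, this allows 
us to use the fully quantum AEP (Theorem 2.4) and therefore to use the theorem 
directly on i.i.d. states and channels. 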

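Theorem 3.8. Let $\rho^{AR}$ be a density operator, $\mathcal{T}^{A \to E}$ be any completely positive 
superoperator, let $\omega^{A'E} = (\mathcal{T} \otimes \mathcal{I}^{A'})(\Phi^{AA'})$, and let $\varepsilon > 0$. Then, 

\[ \int_{\mathbb{U}(A)} \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \, dU \le 2^{-\frac{1}{2}H_2^\varepsilon(A'|E)_\omega - \frac{1}{2}H_2^\varepsilon(A|R)_\rho} + 8\varepsilon \tag{3.47} \]

where $\int \cdot \, dU$ denotes the integral over the Haar measure on all unitaries $U^A$. 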



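Proof. Let $U_{\mathcal{T}}^{A \to CE}$ be a Stinespring extension of $\mathcal{T}$, and let $\hat{\omega}^{A'E}$ be such that 
$d_F(\hat{\omega}, \omega) \le \varepsilon$ and $H_2(A'|E)_{\hat{\omega}} = H_2^\varepsilon(A'|E)_\omega$. Also, let $\hat{\rho}^{AR}$ be such that $d_F(\hat{\rho}, \rho) \le \varepsilon$ 
and $H_2(A|R)_{\hat{\rho}} = H_2^\varepsilon(A|R)_\rho$. Write $\hat{\omega} - \omega = \Delta_+ - \Delta_-$ where $\Delta_\pm \in \mathrm{Pos}(A' \otimes E)$ 
have orthogonal support. Since $d_F(\hat{\omega}, \omega) \le \varepsilon$, $\|\hat{\omega} - \omega\|_1 \le 2\varepsilon$ (see Lemma 2.2) and 
$\|\Delta_\pm\|_1 \le 2\varepsilon$. We now define $\hat{\omega}' := \omega - \Delta_-$. By the definition of $H_2$ and the fact 
that $\hat{\omega}' \le \hat{\omega}$, we have that $H_2(A'|E)_{\hat{\omega}'} \ge H_2(A'|E)_{\hat{\omega}}$. 

Let $P^C \le I^C$ be a positive semidefinite operator such that $\mathrm{Tr}_C[P U_{\mathcal{T}} \cdot \Phi^{AA'}] = \hat{\omega}'$ 
(whose existence is guaranteed by Lemma 1.3, since $\hat{\omega}' \le \omega$) and define $\hat{\mathcal{T}}(\xi) = 
\mathrm{Tr}_C[P U_{\mathcal{T}} \cdot \xi]$. Then, using the previous theorem, we get 

\begin{align*}
2^{-\frac{1}{2}H_2^\varepsilon(A'|E)_\omega - \frac{1}{2}H_2^\varepsilon(A|R)_\rho}
&\ge \int_{\mathbb{U}(A)} \big\|\hat{\mathcal{T}}(U \cdot \hat{\rho}^{AR}) - \hat{\omega}'^E \otimes \hat{\rho}^R\big\|_1 \, dU \\
&\ge \int_{\mathbb{U}(A)} \big\|\hat{\mathcal{T}}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \, dU - 6\varepsilon \\
&\ge \int_{\mathbb{U}(A)} \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \, dU - \int_{\mathbb{U}(A)} \big\|\mathcal{T}(U \cdot \rho^{AR}) - \hat{\mathcal{T}}(U \cdot \rho^{AR})\big\|_1 \, dU - 6\varepsilon.
\end{align*}

We now deal with the second term above: 

\begin{align*}
\int_{\mathbb{U}(A)} \big\|\mathcal{T}(U \cdot \rho^{AR}) - \hat{\mathcal{T}}(U \cdot \rho^{AR})\big\|_1 \, dU
&= \int \big\|\mathrm{Tr}_C\big[(I^C - (P^C)^2)(U_{\mathcal{T}} U \cdot \rho^{AR})\big]\big\|_1 \, dU \\
&= \int \mathrm{Tr}\big[(I^C - (P^C)^2)(U_{\mathcal{T}} U \cdot \rho^{AR})\big] \, dU \\
&= \mathrm{Tr}\Big[(I^C - (P^C)^2)\Big(U_{\mathcal{T}} \cdot \int U \cdot \rho^{AR} \, dU\Big)\Big] \\
&= \mathrm{Tr}\big[(I^C - (P^C)^2)(U_{\mathcal{T}} \cdot \pi^A \otimes \rho^R)\big] \\
&= \mathrm{Tr}[\omega - \hat{\omega}'] \\
&\le 2\varepsilon,
\end{align*}

where the second equality follows from the fact that $P^C \le I^C$. This results in: 

\[ \int_{\mathbb{U}(A)} \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \, dU \le 2^{-\frac{1}{2}H_2^\varepsilon(A'|E)_\omega - \frac{1}{2}H_2^\varepsilon(A|R)_\rho} + 8\varepsilon \]

which concludes the proof. □ 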



It is also possible to show that, with very high probability the value of the 
left-hand side in Theorem 3.7 is very close to its expected value. This is shown 
in the next theorem. First, however, we must define the Lipschitz constant of a 
function: 

Definition 3.2 (Lipschitz constant). Let $f : \mathcal{X} \to \mathcal{Y}$ be a function from the set $\mathcal{X}$ to 
the set $\mathcal{Y}$ endowed with distance measures $d_{\mathcal{X}}$ and $d_{\mathcal{Y}}$. Then, the Lipschitz constant of $f$ 
is defined as 

\[ \sup_{x_1 \ne x_2 \in \mathcal{X}} \frac{d_{\mathcal{Y}}(f(x_1), f(x_2))}{d_{\mathcal{X}}(x_1, x_2)}. \]

If the above quantity is not bounded, the constant is not defined. 

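Theorem 3.9. In the scenario described in the statement of Theorem 3.7, we have that 

\[ \Pr\Big\{ \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \ge 2^{-\frac{1}{2}H_2(A'|E)_\omega - \frac{1}{2}H_2(A|R)_\rho} + r \Big\} \le 2e^{-|A| r^2 / (16 K^2 \|\rho^A\|_\infty)} \tag{3.48} \]

where $K = \max\{\|\mathcal{T}(X)\|_1 : X \in \mathrm{Herm}(A), \|X\|_1 \le 1\}$, and where the probability is 
computed over the choice of $U$. 

Proof. This is a corollary of Corollary 4.4.28 in [AGZ09], which states that, for a 
$c$-Lipschitz function $f : \mathbb{U}(A) \to \mathbb{R}$, 

\[ \Pr\{|f(U) - \mathbb{E}f| \ge \delta\} \le 2e^{-|A|\delta^2 / 4c^2}. \tag{3.49} \]

We are interested in the function $f(U) = \|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\|_1$; we therefore 
need to bound its Lipschitz constant. Let $|\rho\rangle^{ABR}$ be a purification of $\rho$ and, 
without loss of generality, assume $f(U) \ge f(V)$. Then, 

\begin{align*}
f(U) - f(V) &= \big\|\mathcal{T}(U \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 - \big\|\mathcal{T}(V \cdot \rho^{AR}) - \omega^E \otimes \rho^R\big\|_1 \\
&\le \big\|\mathcal{T}(U \cdot \rho^{AR}) - \mathcal{T}(V \cdot \rho^{AR})\big\|_1 \\
&\le K \big\|U \cdot \rho^{ABR} - V \cdot \rho^{ABR}\big\|_1 \\
&\le 2K \big\|(U - V)|\rho\rangle^{ABR}\big\|_2 \\
&= 2K \big\|\mathrm{op}_{BR \to A}\big((U - V)|\rho\rangle^{ABR}\big)\big\|_2 \\
&= 2K \big\|(U - V)\,\mathrm{op}_{BR \to A}(|\rho\rangle^{ABR})\big\|_2 \\
&\le 2K \|U - V\|_2 \,\big\|\mathrm{op}_{BR \to A}(|\rho\rangle^{ABR})\big\|_\infty \\
&= 2K \|U - V\|_2 \sqrt{\|\rho^A\|_\infty}
\end{align*}

where the third inequality comes from Lemma 1.4, the last inequality is an 
application of Lemma 1.5, and $\|\cdot\|_\infty$ denotes the largest singular value of a matrix. 
Hence, the Lipschitz constant of $f$ is upper bounded by $2K\sqrt{\|\rho^A\|_\infty}$ and the 
theorem follows. □ 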



All of the constructions given in this section involve selecting unitary operators 
randomly according to the Haar measure. This would not be very practical 
in real life: there is no guarantee that a matrix chosen this way could be 
implemented efficiently by a quantum circuit; in fact there is an extremely high chance 
that it could not be. However, for most of the above theorems, there is a way out 
of this: apart from Theorem 3.9, the Haar measure can be replaced in all of the 
above theorems by a unitary 2-design, which is defined as follows: 

Definition 3.3 (Unitary 2-design). We call a finite set of unitaries $D \subset \mathbb{U}(A)$ a 
unitary 2-design if 

\[ \frac{1}{|D|} \sum_{U \in D} U^{\otimes 2} \cdot M = \int_{\mathbb{U}(A)} U^{\otimes 2} \cdot M \, dU \]

for every $M \in L(A^{\otimes 2})$, where the integral is taken over the Haar measure on $\mathbb{U}(A)$. 

Since all of the theorems of this section with the exception of Theorem 3.9 
only involve the Haar measure in integrals of this type, the Haar measure can 
be replaced by a unitary 2-design without affecting the rest of the theorem state- 
ments. An example of a unitary 2-design is the Clifford group (the group of 
unitaries that take Pauli operators to Pauli operators) [DLT02]. 
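
For a single qubit, the 2-design property of the Clifford group can be checked directly against the formula of Lemma 3.4. The sketch below is an illustration under stated assumptions: it generates the 24 single-qubit Clifford operations (modulo global phase) from the Hadamard and phase gates, averages $U^{\otimes 2} \cdot M$ over them, and compares the result with $\alpha I + \beta F$, where $\alpha$ and $\beta$ solve the two trace conditions of the lemma. All function names are hypothetical.

```python
import numpy as np

def swap_operator(d):
    """Matrix of F on C^d (x) C^d, defined by F|a>|b> = |b>|a>."""
    f = np.zeros((d * d, d * d))
    for a in range(d):
        for b in range(d):
            f[b * d + a, a * d + b] = 1.0
    return f

def single_qubit_cliffords():
    """Generate the single-qubit Clifford group modulo global phase from H and S."""
    h = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    s = np.array([[1, 0], [0, 1j]], dtype=complex)

    def canonical(u):
        # Fix the global phase: make the first nonzero entry real and positive.
        flat = u.ravel()
        z = flat[np.flatnonzero(np.abs(flat) > 1e-9)[0]]
        return u * (abs(z) / z)

    def key(u):
        # Rounded entries serve as a dictionary key; "+ 0.0" removes signed zeros.
        return tuple((np.round(canonical(u), 6) + 0.0).ravel())

    identity = np.eye(2, dtype=complex)
    group = {key(identity): identity}
    frontier = [identity]
    while frontier:
        u = frontier.pop()
        for g in (h, s):
            v = canonical(g @ u)
            k = key(v)
            if k not in group:
                group[k] = v
                frontier.append(v)
    return list(group.values())

def twirl(m, unitaries):
    """Average of (U (x) U) M (U (x) U)^dagger over a finite set of unitaries."""
    acc = np.zeros_like(m, dtype=complex)
    for u in unitaries:
        uu = np.kron(u, u)
        acc += uu @ m @ uu.conj().T
    return acc / len(unitaries)

def haar_twirl_formula(m, d):
    """alpha*I + beta*F with Tr[M] = alpha d^2 + beta d, Tr[MF] = alpha d + beta d^2."""
    f = swap_operator(d)
    coeffs = np.array([[d * d, d], [d, d * d]], dtype=complex)
    traces = np.array([np.trace(m), np.trace(m @ f)])
    alpha, beta = np.linalg.solve(coeffs, traces)
    return alpha * np.eye(d * d) + beta * f

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    gap = np.abs(twirl(m, single_qubit_cliffords()) - haar_twirl_formula(m, 2)).max()
    print(gap)
```

Global phases are discarded because they cancel under conjugation, so averaging over the 24 representatives is enough.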

For Theorem 3.9, which makes use of the concentration properties of the 
Haar measure, we do not yet know how to replace the Haar measure by some- 
thing constructive. 

3.3 Corollaries of the decoupling theorem 

As mentioned previously, many well-known results can be shown to be special 
cases of this theorem, including the Fully Quantum Slepian-Wolf theorem 
[ADHW06], as well as state merging [HOW07]. We present them here for 
completeness: 

Corollary 3.10 (FQSW [ADHW06]). Let $\rho^{AR}$ be a mixed state, and $A = A_1 \otimes A_2$. 
Then, we have that 

\[ \int_{\mathbb{U}(A)} \big\|\mathrm{Tr}_{A_2}[U \cdot \rho^{AR}] - \pi^{A_1} \otimes \rho^R\big\|_1 \, dU \le \sqrt{\frac{|A_1|}{|A_2|}\, 2^{-H_2(A|R)_\rho}} \tag{3.50} \]

where the integral is over the Haar measure on $\mathbb{U}(A)$, and $\pi^{A_1}$ denotes the completely 
mixed state on $A_1$. 

Proof. Consider the superoperator $\mathcal{T}^{A \to A_1}(\xi) = \mathrm{Tr}_{A_2}[\xi]$ and define $\omega^{A'A_1} = 
\mathrm{Tr}_{A_2}[\Phi^{AA'}]$, and then apply Theorem 3.7. It is easy to show that $\mathrm{Tr}[(\omega^{A'A_1})^2] = 
1/|A_2|$, from which the result follows. □ 
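
The purity claim in this proof can be confirmed numerically. The following sketch assumes the index ordering $A_1 \otimes A_2 \otimes A'$ for the maximally entangled state $\Phi^{AA'}$ with $A = A_1 \otimes A_2$; the function names are illustrative.

```python
import numpy as np

def maximally_entangled(d):
    """Density matrix of |Phi> = d^{-1/2} sum_i |i>|i> on A (x) A'."""
    v = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(v, v)

def omega_after_tracing_a2(d1, d2):
    """omega^{A_1 A'} = Tr_{A_2}[Phi^{A A'}] with A = A_1 (x) A_2."""
    d = d1 * d2
    phi = maximally_entangled(d).reshape(d1, d2, d, d1, d2, d)
    # The repeated index b traces out A_2; the remaining indices are A_1 and A'.
    omega = np.einsum('abcdbf->acdf', phi)
    return omega.reshape(d1 * d, d1 * d)

def purity(state):
    """Tr[state^2]."""
    return np.trace(state @ state).real

if __name__ == "__main__":
    print(purity(omega_after_tracing_a2(2, 3)))
```

The result depends only on $|A_2|$, consistent with the fact that the state on $A_1A'$ has the same spectrum as the completely mixed state on the traced-out system $A_2$.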

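Corollary 3.11 (State merging). Let $\rho^{AR}$ be a mixed state, and let $\{M_i^{A \to E} : M_i \in 
L(A, E), i \in \{1, \ldots, n\}\}$ be a set of measurement operators (i.e. $\sum_i M_i^\dagger M_i = I^A$) such 
that each $M_i$ is a rank-$|E|$ partial isometry. Then, 

\[ \int \sum_{i=1}^{n} \Big\| M_i U \cdot \rho^{AR} - \frac{1}{n}\, \pi^E \otimes \rho^R \Big\|_1 \, dU \le \sqrt{|E|\, 2^{-H_2(A|R)_\rho}} \tag{3.51} \]

where we integrate over $\mathbb{U}(A)$. 

For simplicity, we do not consider the case where $|A|$ is not divisible by $|E|$; 
the extension to the general case is straightforward. 

Proof. Let $X$ be an $n$-dimensional subsystem, and let $\mathcal{T}^{A \to EX}$ be a superoperator 
such that 

\[ \mathcal{T}(\sigma^A) = \sum_{i=1}^{n} |i\rangle\langle i|^X \otimes (M_i^{A \to E} \cdot \sigma^A) \tag{3.52} \]

and define the state $\omega^{A'EX} = \mathcal{T}(\Phi^{AA'})$. It can easily be shown that $\mathrm{Tr}[(\omega^{A'EX})^2] = 
1/n$, from which we get 

\[ 2^{-H_2(A'|EX)_\omega} \le |E|\, n\, \mathrm{Tr}\big[(\omega^{A'EX})^2\big] = |E| \tag{3.53} \]

and the result follows. □ 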



The next corollary is an intermediate lemma from the state merging paper 
[HOW07], and also forms the basis of [HHYW08]: 

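Corollary 3.12 (Random subspaces). Let $\rho^{AR}$ be a mixed state and let $V^{A \to E}$ be a 
fixed rank-$|E|$ partial isometry. Then, 

\[ \int_{\mathbb{U}(A)} \Big\| \frac{|A|}{|E|}\, V U \cdot \rho^{AR} - \pi^E \otimes \rho^R \Big\|_1 \, dU \le \sqrt{|E|\, 2^{-H_2(A|R)_\rho}}. \tag{3.54} \]

Proof. Consider the superoperator $\mathcal{T}^{A \to E}$ such that $\mathcal{T}(\sigma^A) = \frac{|A|}{|E|}\, V \cdot \sigma^A$. The result 
follows immediately from Theorem 3.7 and the fact that $\mathcal{T}(\Phi^{AA'})$ is a pure state. □ 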

One can also come up with interesting blends of the above. For instance, 
one can mix FQSW and the random subspaces theorem above. The operation T 
we will consider is the following: we apply a fixed unitary operator on Alice's 
share, then restrict her system to only a subspace by applying a fixed projector, 
and then trace out part of the remaining state. We call the post-restriction system 
E, which is divided into two shares E 1 and E 2 ; we trace out E 2 and Alice is left 
with only E 1 . The result is the following protocol, which will be presented and 
shown to be essentially optimal in [BCR09]: 

Corollary 3.13. Let $\rho^{AR}$ be a density operator, and let $\mathcal{T}^{A \to E_1}$ be a completely positive 
superoperator such that 

\[ \mathcal{T}(\sigma^A) = \frac{|A|}{|E|}\, \mathrm{Tr}_{E_2}\big[V^{A \to E} \cdot \sigma^A\big] \]

where $V$ is a partial isometry, and $E = E_1 \otimes E_2$. Then, 

\[ \int_{\mathbb{U}(A)} \big\|\mathcal{T}(U \cdot \rho^{AR}) - \pi^{E_1} \otimes \rho^R\big\|_1 \, dU \le \sqrt{\frac{|E_1|}{|E_2|}\, 2^{-H_2(A|R)_\rho}} \tag{3.55} \]

where $\int \cdot \, dU$ denotes the integral over the Haar measure on all unitaries $U^A$. 

Proof. Follows trivially from Theorem 3.7 and the calculation of 
$H_2(A'|E_1)_{\mathcal{T}(\Phi^{AA'})} = \log|E_2| - \log|E_1|$. □ 

3.4 Quantum coding theorems via decoupling 

In this section, we use the theorems from Section 3.2 to derive a one-shot 
coding theorem for quantum channels. As explained earlier, our strategy will be 
to show that the complementary channel (i.e. the channel to the environment) 
completely breaks all correlations with a system that purifies the input. As a 
result, we get that Bob is able to reconstruct the message. 

We will consider the following problem: Alice and Bob share a pure state 
$\psi^{ABR}$; Alice holds $A$, Bob holds $B$, and $R$ is a reference system which purifies the 
state. Alice would like to send her share of the state to Bob through a single use 
of the quantum channel $\mathcal{N}^{A' \to C}$. To accomplish this, we need to find encoding 
and decoding superoperators $\mathcal{E}^{A \to A'}$ and $\mathcal{D}^{CB \to AB}$ such that 

\[ \|(\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})(\psi) - \psi\|_1 \le \varepsilon. \tag{3.56} \]

Note that Buscemi and Datta [BD09] have considered a similar problem but 
without any $B$ system already at Bob's. The following generalizes their result to 
the case where Bob already has a share of the system: 

Theorem 3.14. Let $\psi^{ABR}$ be a pure state, $\mathcal{N}^{A' \to C}$ be any completely positive 
trace-preserving superoperator with Stinespring dilation $U_{\mathcal{N}}^{A' \to CE}$ and complementary 
channel $\bar{\mathcal{N}}^{A' \to E}$, let $\omega^{A''CE} = U_{\mathcal{N}} \cdot \sigma^{A''A'}$, where $\sigma$ is any pure state and $A'' \cong A'$, and let 
$\varepsilon > 0$. Then, there exists an encoding partial isometry $V^{A \to A'}$ and a decoding 
superoperator $\mathcal{D}^{CB \to AB}$ such that 

\[ \big\|(\mathcal{D} \circ \mathcal{N})(V \cdot \psi^{ABR}) - \psi^{ABR}\big\|_1 \le 2\sqrt{2\sqrt{\delta_1} + \delta_2} \]

where 

\begin{align*}
\delta_1 &= 3 \times 2^{\frac{1}{2}H_{\max}^\varepsilon(A)_\psi - \frac{1}{2}H_2^\varepsilon(A'')_\omega} + 24\varepsilon \\
\delta_2 &= 3 \times 2^{-\frac{1}{2}H_2^\varepsilon(A''|E)_\omega - \frac{1}{2}H_2^\varepsilon(A|R)_\psi} + 24\varepsilon.
\end{align*}

Here, $\delta_1$ determines how closely the input $\psi^A$ can be made to fit the target input 
distribution $\omega^{A''}$, whereas $\delta_2$ depends on the difference between the amount 
of information that must be transmitted ($-H_2^\varepsilon(A|R)_\psi$) and the 
information-carrying capability of the channel ($H_2^\varepsilon(A''|E)_\omega$). See Figures 3.1 and 3.2 for an 
illustration of the theorem. 

Figure 3.1: Diagram illustrating Theorem 3.14. Each line represents a quantum 
system, boxes represent isometries, and the horizontal axis represents the 
passage of time. Lines joined together at either end of the diagram represent pure 
states. Alice uses $V$ to encode her message $A$ into the input to the channel $A'$, 
and Bob uses the channel output $C$ together with the $B$ that he had since the 
beginning to decode $A$ (and $B$) back. The decoder also produces a system $F$ that 
purifies the environment. 
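Proof. Let $W^{A \to A''}$ be any full-rank partial isometry and consider the superoperator 
$\mathcal{T}^{A'' \to E}$ defined as $\mathcal{T}(\xi) = |A''|\, \bar{\mathcal{N}}(\mathrm{op}_{A'' \to A'}(|\sigma\rangle) \cdot \xi)$. Theorem 3.8 then tells us 
that 

\begin{align*}
\int \Big\| |A''|\, \bar{\mathcal{N}}\big(\mathrm{op}_{A'' \to A'}(|\sigma\rangle)\, U^{A''} W \cdot \psi^{AR}\big) - \omega^E \otimes \psi^R \Big\|_1 \, dU
&= \int \big\|\mathcal{T}(UW \cdot \psi^{AR}) - \omega^E \otimes \psi^R\big\|_1 \, dU \\
&\le 2^{-\frac{1}{2}H_2^\varepsilon(A''|E)_\omega - \frac{1}{2}H_2^\varepsilon(A|R)_\psi} + 8\varepsilon.
\end{align*}

We must now prove that there exists a $U$ such that conjugating $\psi$ by 
$\sqrt{|A''|}\, \mathrm{op}_{A'' \to A'}(|\sigma\rangle) UW$ can be approximated by an isometry and for which 
the above inequality holds. For this, we use Theorem 3.8 again, with $\mathcal{E}^{A'' \to G}$, 
$\mathcal{E}(\xi) = |A''|\, \mathrm{Tr}[\mathrm{op}_{A'' \to A'}(|\sigma\rangle) \cdot \xi]$ ($G$ is a dummy 1-dimensional system): 

\begin{align*}
\int \Big\| |A''|\, \mathrm{Tr}_{A'}\big[\mathrm{op}_{A'' \to A'}(|\sigma\rangle)\, UW \cdot \psi^{ABR}\big] - \psi^{BR} \Big\|_1 \, dU
&\le 2^{-\frac{1}{2}H_2^\varepsilon(A|BR)_\psi - \frac{1}{2}H_2^\varepsilon(A''|G)_{\mathcal{E}(\Phi)}} + 8\varepsilon \\
&\le 2^{\frac{1}{2}H_{\max}^\varepsilon(A)_\psi - \frac{1}{2}H_2^\varepsilon(A'')_\omega} + 8\varepsilon
\end{align*}

where we have used Lemma 2.8 to establish that $\mathcal{E}(\Phi) = \omega$. Now, by Markov's 
inequality (Lemma 1.7), we can choose a $U$ such that 

\begin{align*}
\Big\| |A''|\, \mathrm{Tr}_{A'}\big[\mathrm{op}_{A'' \to A'}(|\sigma\rangle)\, UW \cdot \psi^{ABR}\big] - \psi^{BR} \Big\|_1 &\le 3 \times 2^{\frac{1}{2}H_{\max}^\varepsilon(A)_\psi - \frac{1}{2}H_2^\varepsilon(A'')_\omega} + 24\varepsilon \\
\Big\| |A''|\, \bar{\mathcal{N}}\big(\mathrm{op}_{A'' \to A'}(|\sigma\rangle)\, U^{A''} W \cdot \psi^{AR}\big) - \omega^E \otimes \psi^R \Big\|_1 &\le 3 \times 2^{-\frac{1}{2}H_2^\varepsilon(A''|E)_\omega - \frac{1}{2}H_2^\varepsilon(A|R)_\psi} + 24\varepsilon.
\end{align*}

The first of these two inequalities allows us to use Uhlmann's theorem 
(Theorem 3.1) to find the encoding isometry: there exists a $V^{A \to A'}$ such that 

\[ \Big\| \sqrt{|A''|}\, \mathrm{op}_{A'' \to A'}(|\sigma\rangle)\, UW \cdot \psi^{ABR} - V \cdot \psi^{ABR} \Big\|_1 \le 2\sqrt{3 \times 2^{\frac{1}{2}H_{\max}^\varepsilon(A)_\psi - \frac{1}{2}H_2^\varepsilon(A'')_\omega} + 24\varepsilon}. \]

Using the triangle inequality and the monotonicity of trace distance under CPTP 
maps, we get that 

\[ \big\|\bar{\mathcal{N}}(V \cdot \psi^{AR}) - \omega^E \otimes \psi^R\big\|_1 \le 2\sqrt{\delta_1} + \delta_2. \tag{3.57} \]

To finish, we use Uhlmann's theorem again on this last inequality to get the 
decoder: there exists a partial isometry $\mathcal{D}^{CB \to FAB}$ such that 

\[ \big\|\mathcal{D}\, U_{\mathcal{N}} V \cdot \psi^{ABR} - \xi^{EF} \otimes \psi^{ABR}\big\|_1 \le 2\sqrt{2\sqrt{\delta_1} + \delta_2} \]

for some state $\xi^{EF}$. Finally, we trace over $EF$ to get the theorem. □ 

Figure 3.2: Diagram illustrating the states $\sigma$ and $\omega$ in Theorem 3.14. Each line 
represents a quantum system, boxes represent isometries, and the horizontal 
axis represents the passage of time. Lines joined together at either end of the 
diagram represent pure states. 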

We now turn to the case of memoryless channels used to transmit arbitrary 
quantum information with some entanglement assistance. This consists of the 
following special situation: the state $\psi^{ABR}$ has the form $\Phi^{RM} \otimes \Phi^{\tilde{A}\tilde{B}}$ where $M\tilde{A}$ 
plays the role of $A$ from the previous theorem, and the channel is $(\mathcal{N}^{A' \to C})^{\otimes n}$. 
We will say that a pair $(Q, E)$ is achievable if there exists a sequence of codes of 
length $n$, with encoders $\mathcal{E}_n^{M_n \tilde{A}_n \to A'^n}$ and decoders $\mathcal{D}_n^{C^n \tilde{B}_n \to M_n}$, with $|M_n| = 
2^{nQ}$, $|\tilde{A}_n| = |\tilde{B}_n| = 2^{nE}$, such that 

\[ \lim_{n \to \infty} \Big\| (\mathcal{D}_n \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_n)\big(\Phi^{R_n M_n} \otimes \Phi^{\tilde{A}_n \tilde{B}_n}\big) - \Phi^{R_n M_n} \Big\|_1 = 0. \]

A rate $Q$ is achievable for entanglement-assisted transmission if there exists 
an $E \ge 0$ such that $(Q, E)$ is achievable, and it is achievable for unassisted 
transmission if $(Q, 0)$ is achievable. The capacity region is the closure of the convex 
hull of all achievable points. 

The achievability of the coherent information for unassisted transmission 
was proven with increasing standards of rigour by Lloyd [Llo96], Shor [Sho02], 
and Devetak [Dev05]. Since then, several other proofs have been published; the 
proof given below shares some similarities with the one by Hayden, Horodecki, 
Yard and Winter [HHYW08]. The entanglement-assisted capacity was first given 
by Bennett, Shor, Smolin and Thapliyal [BSST02]. Theorem 3.15, which interpolates 
between those two results, can also be obtained by time-sharing between 
the completely assisted and unassisted protocols. 

Theorem 3.15. For any quantum channel $\mathcal{N}^{A' \to C}$, any pure state $\sigma^{AA'}$ with $A' \cong A$, the 
rate pair $(Q, E)$ is achievable for quantum transmission with rate-limited entanglement 
assistance through $\mathcal{N}$ if 

\[ Q + E < H(A)_\sigma \quad \text{and} \quad Q - E < I(A\rangle C)_{\mathcal{N}(\sigma)}. \]

As a corollary, if we do not limit the rate of entanglement assistance, $Q < \frac{1}{2}I(A; C)_{\mathcal{N}(\sigma)}$ 
is achievable. 

The first condition (that $Q + E < H(A)_\sigma$) says that both the quantum 
information to be transmitted and Alice's share of the EPR pairs must fit into the 
input to the channel, and the second condition says that the channel can carry 
$I(A\rangle C)$ qubits per transmission when no entanglement is used, but the rate can 
be "boosted" at the rate of one ebit per qubit until the first condition is saturated. 
If we saturate the first condition, then we get the entanglement-assisted capacity 
of $\frac{1}{2}I(A; C)_{\mathcal{N}(\sigma)}$. 

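Proof. The proof essentially consists of using the previous theorem on $\mathcal{N}^{\otimes n}$ and 
using the fully quantum AEP (Theorem 2.4) to bound the various conditional 
entropies. Let $U_{\mathcal{N}}^{A' \to CE}$ be a Stinespring dilation of $\mathcal{N}$, and let $R$ and $M$ be 
subsystems of dimension $2^{nQ}$, $M$ storing the quantum message Alice wants to 
transmit, and $R$ being its purifying system. Likewise, let $\tilde{A}$ and $\tilde{B}$ be systems storing 
Alice's and Bob's part of the shared entanglement respectively, both of 
dimension $2^{nE}$. Now, the input state we will consider is $\psi^{M\tilde{A}\tilde{B}R} = \Phi^{RM} \otimes \Phi^{\tilde{A}\tilde{B}}$, where 
$M\tilde{A}$ play the role of $A$ from the previous theorem. We are now in a position 
to use the previous theorem with $\psi$ as the input state, $\mathcal{N}^{\otimes n}$ as the channel, and 
$\omega^{A^n C^n E^n} = U_{\mathcal{N}}^{\otimes n} \cdot \sigma^{\otimes n}$ to conclude that there exists an isometry $V^{M\tilde{A} \to A'^n}$ and a 
CPTP map $\mathcal{D}^{C^n \tilde{B} \to M}$ such that 

\[ \big\|(\mathcal{D} \circ \mathcal{N}^{\otimes n})(V \cdot \psi^{M\tilde{A}\tilde{B}R}) - \Phi^{RM}\big\|_1 \le 2\sqrt{2\sqrt{\delta_1} + \delta_2} \]

where 

\begin{align*}
\delta_1 &= 3 \times 2^{\frac{1}{2}H_{\max}^\varepsilon(M\tilde{A})_\psi - \frac{1}{2}H_2^\varepsilon(A^n)_\omega} + 24\varepsilon \\
\delta_2 &= 3 \times 2^{-\frac{1}{2}H_2^\varepsilon(A^n|E^n)_\omega - \frac{1}{2}H_2^\varepsilon(M\tilde{A}|R)_\psi} + 24\varepsilon.
\end{align*}

Now we simply need to ensure that both $\delta_1$ and $\delta_2$ go down to zero as $n \to \infty$. 
We then have that: 

\begin{align}
H_{\max}^\varepsilon(M\tilde{A})_\psi &\le nQ + nE \tag{3.58} \\
H_2^\varepsilon(A^n)_\omega &\ge n\,[H(A)_\sigma - \Delta_1] \tag{3.59} \\
H_2^\varepsilon(A^n|E^n)_\omega &\ge n\,[H(A|E)_{U_{\mathcal{N}} \cdot \sigma} - \Delta_2] = n\,[I(A\rangle C)_{\mathcal{N}(\sigma)} - \Delta_2] \tag{3.60} \\
H_2^\varepsilon(M\tilde{A}|R)_\psi &\ge -nQ + nE \tag{3.61}
\end{align}

where the $\Delta_i$ can be computed from the statement of the fully quantum AEP 
(Theorem 2.4) and can be made arbitrarily close to zero for large enough $n$. 
Hence, we get that 

\begin{align*}
\delta_1 &\le 3 \times 2^{\frac{n}{2}[Q + E - H(A)_\sigma + \Delta_1]} + 24\varepsilon \\
\delta_2 &\le 3 \times 2^{\frac{n}{2}[Q - E - I(A\rangle C)_{\mathcal{N}(\sigma)} + \Delta_2]} + 24\varepsilon.
\end{align*}

Hence, for any pair $(Q, E)$ such that $Q + E < H(A)_\sigma$ and $Q - E < I(A\rangle C)_{\mathcal{N}(\sigma)}$, 
there exists a protocol for which the error goes down to zero as $n \to \infty$. 

To get the corollary on fully entanglement-assisted transmission, we simply 
add the two constraints to get $2Q < H(A)_\sigma + I(A\rangle C)_{\mathcal{N}(\sigma)} = I(A; C)_{\mathcal{N}(\sigma)}$ and 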
hence $Q < \frac{1}{2}I(A; C)_{\mathcal{N}(\sigma)}$. □ 

3.5 Destroying correlations by adding classical randomness 

In [GPW05], the authors discuss the following question: given a quantum 
state $(\rho^{AB})^{\otimes n}$, how many bits of classical randomness must be added to it to turn 
it into a product state? By "adding $k$ bits of randomness" to a quantum state, 
we mean applying one of $2^k$ unitaries uniformly at random to either $A^n$ or $B^n$. 
They find a method such that, as long as $k > n[I(A; B)_\rho + \delta]$, the distance to 
a decoupled state goes to zero as $n \to \infty$. This theorem constitutes one of the 
first direct operational interpretations of the quantum mutual information for 
arbitrary density operators. We can recover both this result and a one-shot version 
of it from Theorem 3.14: 

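Theorem 3.16. Let $\rho^{AB} \in D(A \otimes B)$ be any quantum state. Then, for any $\varepsilon > 0$, there 
exists a set of $2^k$ unitaries $U_i^A$, with $k \le 2H_{\max}^\varepsilon(A)_\rho + 4\log(1/\varepsilon) + 4$, and a $\xi^A \in D(A)$ 
such that 

\[ \Big\| 2^{-k} \sum_{i=1}^{2^k} U_i^A \cdot \rho^{AB} - \xi^A \otimes \rho^B \Big\|_1 \le 3 \times 2^{\frac{1}{2}[H_{\max}^\varepsilon(A)_\rho - H_2^\varepsilon(A|B)_\rho - k + 2\log(1/\varepsilon)]} + 2\sqrt{27\varepsilon} + 24\varepsilon. \]

Proof. Let $P^A$ be a projector onto a subspace of $A$ of dimension $D \ge \sqrt{2^k}$, and 
let $\{V_j^A\}_{j=1}^{2^k}$ be a set of Weyl operators (unitaries such that $\mathrm{Tr}[V_i^\dagger V_j] = 0$ for every 
$i \ne j$) on the support of $P^A$. Now, define the superoperator $\mathcal{T}^{A \to A}$ as 

\[ \mathcal{T}(\xi) = 2^{-k} \sum_{i=1}^{2^k} V_i \cdot \xi. \]

We can now apply Theorem 3.14 with $\mathcal{T}$ playing the role of $\bar{\mathcal{N}}$, $\rho^{AB}$ as the input 
state and with input distribution $\sigma^{A''A} = \frac{|A|}{D} P^A \Phi^{A''A} P^A$ to get that there exists a 
unitary $U^A$ such that 

\[ \big\|\mathcal{T}(U^A \cdot \rho^{AB}) - \omega^A \otimes \rho^B\big\|_1 \le 2\sqrt{\delta_1} + \delta_2 \]

where $\omega^{A''A} = \mathcal{T}(\sigma^{A''A})$, and 

\begin{align*}
\delta_1 &= 3 \times 2^{\frac{1}{2}H_{\max}^\varepsilon(A)_\rho - \frac{1}{2}\log D} + 24\varepsilon \\
\delta_2 &= 3 \times 2^{\frac{1}{2}[\log D - k - H_2^\varepsilon(A|B)_\rho]} + 24\varepsilon
\end{align*}

since $H_2(A'')_\omega = \log D$ and $H_2(A''|A)_\omega = -\log D + k$. In other words, 

\[ \Big\| 2^{-k} \sum_{i=1}^{2^k} V_i U^A \cdot \rho^{AB} - \omega^A \otimes \rho^B \Big\|_1 \le 2\sqrt{\delta_1} + \delta_2. \]

We can now define $U_i^A := V_i^A U^A$ to get our desired set of unitaries. All that is 
left to do is to let $\log D = H_{\max}^\varepsilon(A)_\rho + 2\log(1/\varepsilon)$ to get that $\delta_1 = 27\varepsilon$, and hence, 
the theorem. □ 

One can also show using the fully quantum AEP (Theorem 2.4) that, for 
an i.i.d. state $(\rho^{AB})^{\otimes n}$, $H_{\max}^\varepsilon(A^n)_{\rho^{\otimes n}} - H_2^\varepsilon(A^n|B^n)_{\rho^{\otimes n}} \to n[H(A)_\rho - H(A|B)_\rho] = 
nI(A; B)_\rho$, which allows us to recover the theorem of [GPW05]. 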



CHAPTER 4 



QUANTUM CHANNELS WITH SIDE INFORMATION AT THE 

TRANSMITTER 

4.1 Introduction 

Consider the following problem: we have a noisy quantum memory device 
that can store n qubits and that contains a certain fraction of defective cells. The 
cells that do work can be modelled as depolarizing channels, but the defective 
ones always output $|0\rangle$. We can test which cells are defective before writing 
to the memory device, but this information is not necessarily available when 
reading from it. What is the best asymptotic rate at which we can store qubits 
reliably on this device? This problem can be generalized to any channel where 
the transmitter has access to side information about the channel state while the 
receiver does not. 

The corresponding classical problem has been solved by Gel'fand and 
Pinsker in [GP80]. They consider channels modelled as a conditional probability 
distribution $P_{Y|XS}(y|x, s)$, $x \in \mathcal{X}$, $s \in \mathcal{S}$, $y \in \mathcal{Y}$, where $x$, $y$ and $s$ represent 
the input, output and state of the channel respectively. The channel state is i.i.d. 
and distributed according to $p_S(s)$. The encoder has access to the entire sequence 
of channel states ahead of time whereas the decoder does not. They have shown 
that the capacity of such a channel is given by 

\[ C = \max_{q_{USX} \in \mathcal{P}} \big[ I(U; Y) - I(U; S) \big] \tag{4.1} \]

where $\mathcal{P}$ is the set of all probability distributions on $\mathcal{U} \times \mathcal{X} \times \mathcal{S}$ such that the 
marginal on $S$ is equal to $p_S(s)$; $\mathcal{U}$ is an arbitrary set that can be chosen such 
that $|\mathcal{U}| \le |\mathcal{X}| + |\mathcal{S}|$. The mutual informations are computed for the distribution 
$P_{Y|XS} \cdot q_{USX}$. 
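
To make the objective in (4.1) concrete, the following sketch evaluates $I(U;Y) - I(U;S)$ for a given joint distribution $q_{USX}$ and channel $P_{Y|XS}$. The toy model is an assumption made for illustration and is not the depolarizing memory of the introduction: a classical cell is stuck at 0 with probability $p$ and is a noiseless bit otherwise. The particular $q$ below (write a uniform bit into working cells, write 0 into defective ones, and set $U$ equal to the stored value) is one feasible choice, so the value it yields is only a lower bound on the maximum in (4.1).

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in bits for a joint distribution given as a 2-D array."""
    px = p_xy.sum(axis=1, keepdims=True)
    py = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (px @ py)[mask])))

def gelfand_pinsker_objective(q_usx, p_y_given_xs):
    """I(U;Y) - I(U;S) for a joint q[u, s, x] and a channel p[y, x, s]."""
    # Joint distribution over (u, s, x, y).
    joint = np.einsum('usx,yxs->usxy', q_usx, p_y_given_xs)
    p_uy = joint.sum(axis=(1, 2))
    p_us = joint.sum(axis=(2, 3))
    return mutual_information(p_uy) - mutual_information(p_us)

def stuck_at_zero_example(p):
    """Hypothetical toy model: a cell is stuck at 0 with probability p and is a
    noiseless bit otherwise. Returns (q_usx, p_y_given_xs)."""
    q = np.zeros((2, 2, 2))        # indices u, s, x; s = 0 defective, s = 1 working
    q[0, 0, 0] = p                 # defective cell: write 0 and set u = 0
    q[0, 1, 0] = (1 - p) / 2       # working cell: uniform input with u = x
    q[1, 1, 1] = (1 - p) / 2
    ch = np.zeros((2, 2, 2))       # indices y, x, s
    ch[0, :, 0] = 1.0              # defective cell always outputs 0
    ch[0, 0, 1] = 1.0              # working cell outputs its input
    ch[1, 1, 1] = 1.0
    return q, ch

if __name__ == "__main__":
    q, ch = stuck_at_zero_example(0.25)
    print(gelfand_pinsker_objective(q, ch))
```

For this choice $U = Y$, so $I(U;Y) = H(U)$ and $I(U;S) = H(U) - (1 - p)$, and the objective evaluates to $1 - p$.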

Here we shall generalize this result to quantum channels and potentially 
quantum side-information using the methods developed in Chapter 3. Namely, 
we will prove that the entanglement-assisted quantum capacity of quantum 
channels with side information at the transmitter has the same form as (4.1) and, 
a relatively rare fact in quantum information theory, has a single-letter converse. 
Along the way, we will prove a one-shot coding theorem as well as a coding 
theorem for quantum transmission with rate-limited entanglement assistance 
for such channels, both in the same spirit as those proven at the end of the last 
chapter. 

4.2 Definition of quantum channels with side information at the transmitter 

A quantum channel with side information at the transmitter is defined by a 
superoperator $\mathcal{N}^{A'S \to C}$ and a quantum state $|\varphi\rangle^{SS'}$; this quantum state represents
the side information. Alice has access to S' and can input a state of her choice 
into A'. One way to view this is to say that Alice shares entanglement with the 
channel itself. This framework allows us to consider both quantum and classical 
side information about the channel in a unified manner. 

To illustrate this, consider the example of the depolarizing channel with 
defects given in the introduction. For this case, we can choose $|\varphi\rangle^{SS'}$ to be
$\sqrt{p}\,|00\rangle + \sqrt{1-p}\,|11\rangle$. The superoperator $\mathcal{N}$ then measures the $S$ subsystem, and
outputs $|0\rangle$ if the outcome is 0. If the outcome is 1, it applies the depolarizing
channel to $A'$ and sends the output to Bob.

In this chapter, we will first be interested in the following one-shot task: Alice 
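and Bob initially share the $A$ and $B$ parts of the state $\psi^{ABR}$, and Alice would
like to use the channel $(\mathcal{N}^{A'S \to C}, |\varphi\rangle^{SS'})$ to send $A$ to Bob. Hence, we want to
ascertain the existence of an encoder $\mathcal{E}^{AS' \to A'}$ and decoder $\mathcal{D}^{CB \to AB}$ such that

\[ \left\| (\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})(\psi^{ABR} \otimes \varphi^{SS'}) - \psi^{ABR} \right\|_1 \leq \varepsilon \]

with an $\varepsilon$ small enough for our purposes.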

We will then specialize our one-shot theorem to the i.i.d. case, in which 
the state $\psi^{MABR}$ has the form $\Phi^{RM} \otimes \Phi^{AB}$ where $MA$ plays the role of $A$, and
the channel is $\mathcal{N}^{\otimes n}$. We will say that a pair $(Q, E)$ is achievable if there exists
a sequence of codes of length $n$, with encoders $\mathcal{E}_n^{M_n A_n S'^n \to A'^n}$ and decoders
$\mathcal{D}_n^{C^n B_n \to M_n}$, with $|M_n| = |R_n| = 2^{nQ}$, $|A_n| = |B_n| = 2^{nE}$, such that

\[ \lim_{n \to \infty} \left\| (\mathcal{D}_n \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_n)\left(\Phi^{R_n M_n} \otimes \Phi^{A_n B_n} \otimes (\varphi^{SS'})^{\otimes n}\right) - \Phi^{R_n M_n} \right\|_1 = 0. \]

A rate $Q$ is achievable for entanglement-assisted transmission if there exists
an $E \geq 0$ such that $(Q, E)$ is achievable, and it is achievable for unassisted
transmission if $(Q, 0)$ is achievable.

The capacity region is the closure of the convex hull of all achievable points. 

The goal of this chapter is to establish the following theorems: 

Theorem 4.1. Let $(\mathcal{N}^{A'S \to C}, |\varphi\rangle^{SS'})$ be a quantum channel with side-information at
the transmitter, and let $\sigma^{AA'S}$ be any mixed state with $\sigma^S = \varphi^S$. Then, any rate point
$(Q, E)$ such that

\[ Q + E < H(A|S)_\sigma \quad \text{and} \quad Q - E < I(A\rangle C)_{\mathcal{N}(\sigma)} \]

is achievable for transmission with rate-limited entanglement assistance. As a corollary,
any rate $Q$ such that $Q < \frac{1}{2}\left[I(A;C)_{\mathcal{N}(\sigma)} - I(A;S)_\sigma\right]$ is achievable for
entanglement-assisted transmission.

Theorem 4.2. The entanglement-assisted quantum capacity of a quantum channel with
side information at the transmitter $(\mathcal{N}^{A'S \to C}, |\varphi\rangle^{SS'})$ is

\[ C = \sup_\sigma \left\{ \tfrac{1}{2} I(A;C)_\omega - \tfrac{1}{2} I(A;S)_\sigma \right\}. \tag{4.2} \]

The supremum is taken over all mixed states of the form $\sigma^{AA'S}$ where $\sigma^S = \varphi^S$ and
$\omega^{AC} = \mathcal{N}^{A'S \to C}(\sigma^{AA'S})$. In other words, the previous theorem with an i.i.d. input
distribution is optimal for coding for entanglement-assisted i.i.d. channels.

This theorem also entails that the entanglement-assisted classical capacity of
quantum channels with side information at the transmitter is

\[ C = \sup_\sigma \left\{ I(A;C)_\omega - I(A;S)_\sigma \right\} \tag{4.3} \]

via super-dense coding.



4.3 Direct coding theorem 

We begin with the one-shot coding theorem: 

Theorem 4.3. Let $\psi^{ABR}$ be a pure state, $(\mathcal{N}^{A'S \to C}, |\varphi\rangle^{SS'})$ be any channel with
side-information at the transmitter with $U_{\mathcal{N}}^{A'S \to CE}$ as Stinespring dilation, and let
$\omega^{A''CED} = U_{\mathcal{N}} \cdot \sigma^{A''A'SD}$, where $\sigma$ is any pure state with $\sigma^S = \varphi^S$. Then, there exists an encoding
CPTP map $\mathcal{E}^{AS' \to A'}$ and a decoding CPTP map $\mathcal{D}^{CB \to AB}$ such that

\[ \left\| (\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})(\psi^{ABR} \otimes \varphi^{SS'}) - \psi^{ABR} \right\|_1 \leq 2\sqrt{2\sqrt{\delta_1} + \delta_2} \]

where

\[ \delta_1 = 3 \times 2^{\frac{1}{2} H^{\varepsilon}_{\max}(A)_\psi - \frac{1}{2} H^{\varepsilon}_2(A''|S)_\sigma} + 24\varepsilon \]

\[ \delta_2 = 3 \times 2^{-\frac{1}{2} H^{\varepsilon}_2(A''|ED)_\omega - \frac{1}{2} H^{\varepsilon}_2(A|R)_\psi} + 24\varepsilon. \]

Hence, to have a good code, one must ensure that both $\delta_1$ and $\delta_2$ are sufficiently
small. Both of these quantities have natural interpretations: $\delta_1$ characterizes
the difference between how "big" the message is ($H^{\varepsilon}_{\max}(A)_\psi$) and how much
space there is in the input to the channel ($H^{\varepsilon}_2(A''|S)_\sigma$), and $\delta_2$ depends on the
difference between how hard the state is to transmit ($-H^{\varepsilon}_2(A|R)_\psi$) and how good
the channel to the environment is at destroying correlations ($H^{\varepsilon}_2(A''|ED)_\omega$). See
Figures 4.1 and 4.2 for illustrations of the protocol.

Figure 4.1: Diagram illustrating Theorem 4.3, with encoder, channel and decoder
purified. Each line represents a quantum system, boxes represent isometries,
and the horizontal axis represents the passage of time. Lines joined together at
either end of the diagram represent pure states. $V$ represents Alice's encoder:
she uses the side information $S'$ to encode the message $A$ into the channel input
$A'$ and discards a system $D$. The decoder $D$ takes the channel output $C$ together
with Bob's initial system $B$ and produces $A$ and $B$ as output; the result being
close to the initial state $\psi$.

Figure 4.2: Diagram illustrating the states $\omega$ and $\sigma$ which define the input
distribution in Theorem 4.3. Each line represents a quantum system, boxes represent
isometries, and the horizontal axis represents the passage of time. Lines joined
together at either end of the diagram represent pure states.

Proof. Let $W^{A \to A''}$ be any full-rank partial isometry, and consider the superoperator
$\mathcal{T}^{A'' \to ED}$ defined as

\[ \mathcal{T}(\xi) = |A''| \operatorname{Tr}_C\left[ U_{\mathcal{N}}\, \mathrm{op}_{A'' \to A'SD}(|\sigma\rangle) \cdot \xi \right]. \]
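Theorem 3.8 then tells us that

\[ \int \left\| \mathcal{T}(UW \cdot \psi^{AR}) - \omega^{ED} \otimes \psi^{R} \right\|_1 dU \leq 2^{-\frac{1}{2} H^{\varepsilon}_2(A''|ED)_\omega - \frac{1}{2} H^{\varepsilon}_2(A|R)_\psi} + 8\varepsilon. \]

We must now prove that there exists a $U$ such that conjugating $\psi$ by
$\sqrt{|A''|}\, \mathrm{op}_{A'' \to A'SD}(|\sigma\rangle) UW$ can be approximated by an isometry of the form
$V^{AS' \to A'D}$ acting on $\psi^{ABR} \otimes \varphi^{SS'}$ and for which the above inequality holds. For
this, we use Theorem 3.8 again, with $\mathcal{E}^{A'' \to S}$, $\mathcal{E}(\xi) = |A''| \operatorname{Tr}_{A'D}\left[\mathrm{op}_{A'' \to A'SD}(|\sigma\rangle) \cdot \xi\right]$:

\[ \int \left\| |A''| \operatorname{Tr}_{A'D}\left[\mathrm{op}_{A'' \to A'SD}(|\sigma\rangle) UW \cdot \psi^{ABR}\right] - \psi^{BR} \otimes \sigma^{S} \right\|_1 dU \leq 2^{-\frac{1}{2} H^{\varepsilon}_2(A|BR)_\psi - \frac{1}{2} H^{\varepsilon}_2(A''|S)_\sigma} + 8\varepsilon \leq 2^{\frac{1}{2} H^{\varepsilon}_{\max}(A)_\psi - \frac{1}{2} H^{\varepsilon}_2(A''|S)_\sigma} + 8\varepsilon \]

where we have used Lemma 2.8 to deduce that $\mathcal{E}(\Phi) = \sigma^{A''S}$. We would like to have
a $U^{A''}$ that satisfies both inequalities. We can do this using Markov's inequality
(Lemma 1.7): there exists a $U^{A''}$ such that

\[ \left\| \mathcal{T}(UW \cdot \psi^{AR}) - \omega^{ED} \otimes \psi^{R} \right\|_1 \leq 3 \times 2^{-\frac{1}{2} H^{\varepsilon}_2(A''|ED)_\omega - \frac{1}{2} H^{\varepsilon}_2(A|R)_\psi} + 24\varepsilon = \delta_2 \]

and

\[ \left\| |A''| \operatorname{Tr}_{A'D}\left[\mathrm{op}_{A'' \to A'SD}(|\sigma\rangle) UW \cdot \psi^{ABR}\right] - \psi^{BR} \otimes \sigma^{S} \right\|_1 \leq 3 \times 2^{\frac{1}{2} H^{\varepsilon}_{\max}(A)_\psi - \frac{1}{2} H^{\varepsilon}_2(A''|S)_\sigma} + 24\varepsilon = \delta_1. \]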
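The factor of 3 in both bounds comes from applying Markov's inequality to each average separately and then taking a union bound: each bound can fail on less than a third of the unitaries, so some unitary satisfies both. The following sketch illustrates this counting argument on a finite list of samples; the finite list standing in for the Haar measure and the function name are assumptions made for this illustration only.

```python
import random

def find_good_index(f1, f2):
    """Return an index i with f1[i] <= 3*mean(f1) and f2[i] <= 3*mean(f2).

    For nonnegative values, Markov's inequality implies that fewer than a
    third of the entries can strictly exceed three times the mean, so at
    least one index passes both tests.
    """
    m1 = sum(f1) / len(f1)
    m2 = sum(f2) / len(f2)
    for i, (a, b) in enumerate(zip(f1, f2)):
        if a <= 3 * m1 and b <= 3 * m2:
            return i
    raise AssertionError("unreachable for nonnegative inputs")

def demo(seed=0, n=1000):
    """Draw two lists of nonnegative samples and locate a common good index."""
    rng = random.Random(seed)
    f1 = [rng.expovariate(1.0) for _ in range(n)]
    f2 = [rng.expovariate(5.0) for _ in range(n)]
    return f1, f2, find_good_index(f1, f2)
```

With $k$ simultaneous conditions the same argument gives a factor of $k + 1$ instead of 3.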
This last condition allows us to use Uhlmann's theorem (Theorem 3.1) to find
the encoding isometry: there exists a $V^{AS' \to A'D}$ such that

\[ \left\| \sqrt{|A''|}\, \mathrm{op}_{A'' \to A'SD}(|\sigma\rangle) UW \cdot \psi^{ABR} - V \cdot (\psi^{ABR} \otimes \varphi^{SS'}) \right\|_1 \leq 2\sqrt{\delta_1}. \]

Using the triangle inequality and the monotonicity of trace distance under
superoperators, we get that

\[ \left\| \left(U_{\mathcal{N}} V \cdot (\psi^{ABR} \otimes \varphi^{SS'})\right)^{EDR} - \omega^{ED} \otimes \psi^{R} \right\|_1 \leq 2\sqrt{\delta_1} + \delta_2. \]

Finally we use Uhlmann's theorem a second time to get a decoding partial
isometry $D^{CB \to ABG}$:

\[ \left\| D U_{\mathcal{N}} V \cdot (\psi^{ABR} \otimes \varphi^{SS'}) - \xi^{GED} \otimes \psi^{ABR} \right\|_1 \leq 2\sqrt{2\sqrt{\delta_1} + \delta_2} \]

for some state $\xi^{GED}$. We then take a partial trace over $GED$ inside the trace
distance to get the theorem. □

We now move on to the memoryless case: 

Theorem 4.4. Let $(\mathcal{N}^{A'S \to C}, |\varphi\rangle^{SS'})$ be a quantum channel with side-information at
the transmitter, and let $\sigma^{AA'S}$ be any mixed state with $\sigma^S = \varphi^S$. Then, any rate point
$(Q, E)$ such that

\[ Q + E < H(A|S)_\sigma \quad \text{and} \quad Q - E < I(A\rangle C)_{\mathcal{N}(\sigma)} \]

is achievable for transmission with rate-limited entanglement assistance. As a corollary,
any rate $Q$ such that $Q < \frac{1}{2}\left[I(A;C)_{\mathcal{N}(\sigma)} - I(A;S)_\sigma\right]$ is achievable for
entanglement-assisted transmission.



Again, the first condition corresponds to how closely we can make the input 
fit the target input distribution a, and the second one is the limit imposed by the 
channel noise. Once again, we can trade ebits for qubits at a one-to-one ratio 
until we reach the limit imposed by the first condition.

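Proof. The proof essentially consists of using the previous theorem on $\mathcal{N}^{\otimes n}$ and
using the fully quantum AEP (Theorem 2.4) to bound the various conditional
entropies. Let $R$ and $M$ be subsystems of dimension $2^{nQ}$, $M$ storing the quantum
message Alice wants to transmit, and $R$ being its purifying system. Likewise,
let $A$ and $B$ be systems storing Alice's and Bob's parts of the shared entanglement
respectively, both of dimension $2^{nE}$. Now consider the input state
$\psi^{MABR} = \Phi^{RM} \otimes \Phi^{AB}$, where $MA$ plays the role of $A$ in the one-shot theorem.
We now use the previous theorem with $\psi$ as the input state, $\mathcal{N}^{\otimes n}$ as the channel,
and $\omega^{A^n C^n E^n D^n} = U_{\mathcal{N}}^{\otimes n} \cdot \sigma^{\otimes n}$ (with $\sigma$ purified on $D$), to conclude that there exist
encoding and decoding CPTP maps $\mathcal{E}^{MAS'^n \to A'^n}$ and $\mathcal{D}^{C^n B \to MAB}$ such that

\[ \left\| (\mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E})\left(\psi^{MABR} \otimes (\varphi^{SS'})^{\otimes n}\right) - \psi^{MABR} \right\|_1 \leq 2\sqrt{2\sqrt{\delta_1} + \delta_2} \]

with

\[ \delta_1 = 3 \times 2^{\frac{1}{2} H^{\varepsilon}_{\max}(MA)_\psi - \frac{1}{2} H^{\varepsilon}_2(A^n|S^n)_{\sigma^{\otimes n}}} + 24\varepsilon \]

\[ \delta_2 = 3 \times 2^{-\frac{1}{2} H^{\varepsilon}_2(A^n|E^n D^n)_\omega - \frac{1}{2} H^{\varepsilon}_2(MA|R)_\psi} + 24\varepsilon. \]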

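We can bound all the entropic terms above. We have that:

\begin{align}
H^{\varepsilon}_{\max}(MA)_\psi &\leq nQ + nE \tag{4.4} \\
H^{\varepsilon}_2(A^n|S^n)_{\sigma^{\otimes n}} &\geq n\left[H(A|S)_\sigma - \Delta_1\right] \tag{4.5} \\
H^{\varepsilon}_2(A^n|E^n D^n)_\omega &\geq n\left[H(A|ED)_{\mathcal{N}(\sigma)} - \Delta_2\right] = n\left[I(A\rangle C)_{\mathcal{N}(\sigma)} - \Delta_2\right] \tag{4.6} \\
H^{\varepsilon}_2(MA|R)_\psi &\geq -nQ + nE \tag{4.7}
\end{align}

where the $\Delta_i$ can be computed from the statement of the fully quantum AEP
(Theorem 2.4) and can be made arbitrarily close to zero for large enough $n$.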

Hence, as long as $Q + E < H(A|S)_\sigma - \Delta_1$ and $Q - E < I(A\rangle C)_{\mathcal{N}(\sigma)} - \Delta_2$,
both $\delta_1$ and $\delta_2$ go down to zero as $n$ grows. The first condition corresponds to
the fact that the message qubits and Alice's share of the entanglement must fit
in the input, and the second condition means that the transmission rate minus
the amount of entanglement must not exceed the coherent information.

To get the entanglement-assisted rate, we can simply add the two inequalities
so as to eliminate $E$. The result is that $2Q < H(A|S)_\sigma + I(A\rangle C)_{\mathcal{N}(\sigma)}$ and a simple
calculation reveals this to be equivalent to $Q < \frac{1}{2}\left[I(A;C)_{\mathcal{N}(\sigma)} - I(A;S)_\sigma\right]$.

□
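The trade-off between $Q$ and $E$ can be sketched numerically. The snippet below checks the two strict inequalities of Theorem 4.4 for a given rate pair and returns the corner of the region at which the entanglement-assisted rate is approached. The numerical values of $H(A|S)_\sigma$ and $I(A\rangle C)_{\mathcal{N}(\sigma)}$ passed in are hypothetical inputs chosen for illustration, and the function names are ours.

```python
def in_rate_region(q, e, h_a_given_s, coherent_info):
    """Check Q + E < H(A|S) and Q - E < I(A>C) for a rate pair (Q, E)."""
    return q + e < h_a_given_s and q - e < coherent_info

def assisted_rate_corner(h_a_given_s, coherent_info):
    """Corner (Q, E) where both inequalities are tight.

    Adding the two inequalities eliminates E and gives
    2Q < H(A|S) + I(A>C) = I(A;C) - I(A;S), so Q approaches half of that sum
    while E approaches half of the difference H(A|S) - I(A>C).
    """
    q = (h_a_given_s + coherent_info) / 2
    e = (h_a_given_s - coherent_info) / 2
    return q, e
```

For instance, with the hypothetical values $H(A|S)_\sigma = 1$ and $I(A\rangle C)_{\mathcal{N}(\sigma)} = 0.5$, the corner is $(Q, E) = (0.75, 0.25)$; the corner itself is excluded because the inequalities are strict, but any point slightly inside it is achievable.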



4.4 Optimality for entanglement-assisted coding 

We shall now prove that the previous theorem is optimal for entanglement-assisted
coding. In other words, for any achievable rate $Q$ for entanglement-assisted
transmission, there exists a state $\sigma^{AA'S}$ as in Theorem 4.2 for which
$Q \leq \frac{1}{2} I(A;C)_{\mathcal{N}(\sigma)} - \frac{1}{2} I(A;S)_\sigma$. This part is essentially the same as in [GP80], with a
few adaptations to the quantum case. In particular, one must pay close attention 
to which state the various mutual informations are defined on, since we will be 
dealing with states where only some fraction of the n instances of the channel 
has been applied. 

Theorem 4.5. The entanglement-assisted quantum capacity of a quantum channel with
side information at the transmitter $(\mathcal{N}^{A'S \to C}, |\varphi\rangle^{SS'})$ is

\[ C = \sup_\sigma \left\{ \tfrac{1}{2} I(A;C)_\omega - \tfrac{1}{2} I(A;S)_\sigma \right\}. \tag{4.8} \]

The supremum is taken over all mixed states of the form $\sigma^{AA'S}$ where $\sigma^S = \varphi^S$ and
$\omega^{AC} = \mathcal{N}^{A'S \to C}(\sigma^{AA'S})$.

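Proof. The achievability of this rate follows directly from Theorem 4.4. We therefore
now need to prove that one cannot go above this rate. First, let $\mathcal{E}^{M_n A_n S'^n \to A'^n}$
and $\mathcal{D}^{C^n B_n \to M_n B_n}$ be the encoder and the decoder respectively of an arbitrary
code of block size $n$ with $\log|R_n| = \log|M_n| = nQ$ such that

\[ \left\| \left[(\mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E})\left(\Phi^{R_n M_n} \otimes \Phi^{A_n B_n} \otimes (\varphi^{SS'})^{\otimes n}\right)\right]^{R_n M_n} - \Phi^{R_n M_n} \right\|_1 \leq \varepsilon, \]

let $\sigma^{R_n B_n A'^n S^n} = \mathcal{E}\left(\Phi^{R_n M_n} \otimes \Phi^{A_n B_n} \otimes (\varphi^{SS'})^{\otimes n}\right)$ and $\omega^{R_n B_n C^n} = \mathcal{N}^{\otimes n}(\sigma)$. Then, by
Fannes' inequality (Theorem 1.9) and the monotonicity of the mutual information
(see Section 2.4.2) we must have that

\[ I(R_n; C^n B_n)_\omega \geq 2n\left(Q - d(\varepsilon, n)\right) \tag{4.9} \]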



where d(e, n) := + 5e1°££. Notice that 



\begin{align}
I(R_n; B_n C^n)_\omega &= I(B_n; R_n)_\omega + I(R_n; C^n | B_n)_\omega \tag{4.10} \\
&= I(R_n; C^n | B_n)_\omega \tag{4.11} \\
&\leq I(R_n B_n; C^n)_\omega \tag{4.12}
\end{align}

where (4.11) is due to the fact that $R_n$ and $B_n$ are independent. Combining this
with $I(R_n B_n; S^n)_\sigma = 0$, we have

\[ I(R_n B_n; C^n)_\omega - I(R_n B_n; S^n)_\sigma \geq 2n\left(Q - d(\varepsilon, n)\right). \tag{4.13} \]

We will now introduce a few shorthands which will make the notation considerably
less cumbersome: we will write $C^i$ instead of $C_1, \ldots, C_i$ and $C_i^j$ instead of
$C_i, \ldots, C_j$, and likewise for $S$. Define also

\begin{align}
X(i) &:= R_n B_n C^{i-1} S_{i+1}^n \tag{4.14} \\
Y(i) &:= R_n B_n S_{i+1}^n \tag{4.15}
\end{align}

Note that these are nothing more than groupings of subsystems. We also define
the following sequence of states:

\[ \omega(i) := (\mathcal{N}^{\otimes i} \otimes \mathbb{I}^{\otimes n-i})(\sigma) \tag{4.16} \]

In other words, $\omega(i)$ is the result of applying the first $i$ instances of the channel
to the state $\sigma$.

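We shall now prove the inequality

\[ I(R_n B_n; C^n)_\omega - I(R_n B_n; S^n)_\sigma \leq \sum_{i=1}^{n} \left\{ I(X(i); C_i)_{\omega(i)} - I(X(i); S_i)_{\omega(i-1)} \right\}. \tag{4.17} \]

Each term in this sum is of the form $I(A;C)_{\mathcal{N}(\sigma)} - I(A;S)_\sigma$ for some $\sigma^{AA'S}$. By (4.13),
the left-hand side is at least $2n(Q - d(\varepsilon, n))$, so the highest of the $n$ terms is at least
$2(Q - d(\varepsilon, n))$. That term is achievable by the direct coding theorem, and therefore there
exists a state for which $Q - d(\varepsilon, n) \leq \frac{1}{2}\left[I(A;C)_{\mathcal{N}(\sigma)} - I(A;S)_\sigma\right]$. This allows us to conclude
the theorem.

We now proceed in exactly the same way as in [GP80] to establish (4.17): we
consider the inequality

\begin{align}
I(Y(i); C^i)_{\omega(i)} - I(Y(i); S^i)_{\omega(i-1)}
&\leq \left[ I(Y(i-1); C^{i-1})_{\omega(i-1)} - I(Y(i-1); S^{i-1})_{\omega(i-1)} \right] \nonumber \\
&\quad + \left[ I(X(i); C_i)_{\omega(i)} - I(X(i); S_i)_{\omega(i-1)} \right]. \tag{4.18}
\end{align}

Summing up all these inequalities from $i = 2$ to $i = n$, we obtain (4.17), since
$Y(n) = R_n B_n$ and $Y(1) = X(1)$.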

Now, to prove (4.18), we use the following identities which follow from the
definitions of $X(i)$ and $Y(i)$ and from basic properties of the mutual information:

\begin{align}
I(Y(i); C^i)_{\omega(i)} &= I(Y(i); C^{i-1})_{\omega(i)} + I(Y(i); C_i | C^{i-1})_{\omega(i)} \tag{4.19} \\
I(Y(i); S^i)_{\omega(i-1)} &= I(Y(i); S_i)_{\omega(i-1)} + I(Y(i); S^{i-1} | S_i)_{\omega(i-1)} \tag{4.20} \\
I(Y(i-1); S^{i-1})_{\omega(i-1)} &= I(Y(i); S^{i-1} | S_i)_{\omega(i-1)} \tag{4.21} \\
I(Y(i-1); C^{i-1})_{\omega(i-1)} &= I(S_i; C^{i-1})_{\omega(i-1)} + I(Y(i); C^{i-1} | S_i)_{\omega(i-1)} \tag{4.22} \\
I(X(i); C_i)_{\omega(i)} &= I(C^{i-1}; C_i)_{\omega(i)} + I(Y(i); C_i | C^{i-1})_{\omega(i)} \tag{4.23} \\
I(X(i); S_i)_{\omega(i-1)} &= I(C^{i-1}; S_i)_{\omega(i-1)} + I(Y(i); S_i | C^{i-1})_{\omega(i-1)}. \tag{4.24}
\end{align}

Substituting these into (4.18) and using the identity

\[ I(A;B) - I(A;B|C) = I(A;C) - I(A;C|B) \tag{4.25} \]

which holds on any mixed state $\rho^{ABC}$, we get that the difference between the
right-hand side and the left-hand side of (4.18) is $I(C^{i-1}; C_i)_{\omega(i)}$, which is always
nonnegative. This concludes the proof. □
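As a quick sanity check, identity (4.25) can be verified numerically in the classical special case, where the state $\rho^{ABC}$ is a joint probability distribution; both sides follow from expanding $I(A;BC)$ with the chain rule in the two possible orders. The sketch below does this on a randomly generated distribution; the helper names and the restriction to binary variables are assumptions made for this illustration.

```python
import random
from itertools import product
from math import log2

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginal of a joint pmf on the coordinates listed in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def cond_mutual_info(joint, a, b, c=()):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C) for index tuples a, b, c."""
    def h(idx):
        return entropy(marginal(joint, idx))
    return h(a + c) + h(b + c) - h(a + b + c) - h(c)

def chain_rule_gap(seed=0):
    """Left-hand side minus right-hand side of (4.25) on a random pmf."""
    rng = random.Random(seed)
    outcomes = list(product(range(2), repeat=3))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    joint = {o: w / total for o, w in zip(outcomes, weights)}
    A, B, C = (0,), (1,), (2,)
    lhs = cond_mutual_info(joint, A, B) - cond_mutual_info(joint, A, B, C)
    rhs = cond_mutual_info(joint, A, C) - cond_mutual_info(joint, A, C, B)
    return lhs - rhs
```

The gap vanishes up to floating-point error for every seed, as expected from the chain rule.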

4.5 Discussion 

This result further strengthens the parallel between classical information the- 
ory problems and their entanglement-assisted quantum counterparts. Indeed, 
the capacity formula (4.2) has the same form as the classical version (4.1); the 
same phenomenon arises in the case of the entanglement-assisted capacities of 
regular point-to-point channels [BSST02], multiple-access channels [HDW08], 
and, for the best coding theorem we know, broadcast channels (see Chapter 5). 
It would be particularly interesting to have a systematic way in which classi- 
cal coding theorems could be transformed into entanglement-assisted quantum 
protocols as it would enable us to import much larger classes of results from 
classical information theory into the quantum world. 

Returning to our result, there is one remaining issue that one would like
to solve in order to have a fully satisfactory characterization of the achievable
rate region: we currently have no upper bound on the dimension of the A sys- 
tem needed to achieve the capacity in expression (4.2). Thus, despite having a 
single-letter converse, we unfortunately do not have a way to compute the ca- 
pacity. In the classical case, it is possible to use Caratheodory's theorem [Car07] 
to bound the cardinality of U in the optimal input distribution. However, in the 
quantum case, this approach fails due to the fact that the quantum conditional 
entropy cannot in general be expressed as $H(A|B) = \sum_b p(b) H(A|B=b)$. On the
other hand, there is little reason to believe that large dimensions are necessary 
to achieve the optimal rate, but we do not know how to prove that this is not the 
case. In fact, one encounters a very similar difficulty when trying to calculate the 
squashed entanglement [CW04] of a particular state since we have no bound on 
the size of the subsystem we need to condition on. We therefore leave this issue 
as an open problem. 

One might also wonder about a related problem: whether the capacity can
in general be achieved by optimizing only over pure states $\sigma^{AA'S}$. This would
imply an upper bound on $|A|$. However, one can show that this cannot be the
case: take, for example, a qubit-to-qubit channel which applies one of the four
Pauli operations with equal probability, but where $S$ tells the transmitter which
one of the four operations is applied. The capacity of such a channel is clearly
one qubit per transmission. Suppose that this rate is achievable using a pure
state $\sigma^{AA'S}$. Then, we must have $\frac{1}{2} I(A;C)_{\mathcal{N}(\sigma)} = 1$ (since $C$ is two-dimensional)
and therefore $\frac{1}{2} I(A;S)_\sigma = 0$. However, this last equation together with the fact
that $\sigma$ is pure implies that the purification of $S$ must be entirely in $A'$. This is
impossible since $S$ is maximally mixed over a four-dimensional system whereas
$A'$ is two-dimensional, and hence the optimal $\sigma$ cannot be pure.



CHAPTER 5 



QUANTUM BROADCAST CHANNELS 

5.1 Introduction 

Discrete memoryless broadcast channels are channels with one sender and $n$
receivers, modelled using an input set $\mathcal{X}$, output sets $\mathcal{Y}_1, \ldots, \mathcal{Y}_n$, and a
probability transition matrix $p(y_1, \ldots, y_n | x)$. When the transmitter selects the
input symbol $x \in \mathcal{X}$, the output at the receivers is distributed according to
$p(y_1, \ldots, y_n | x)$. These can represent, for instance, a radio tower broadcasting
a signal to many receivers, each of whom experiences different signal
corruption due to being closer to or further away from the tower, or due to the
proximity of buildings. There are many natural tasks that one may want to perform
using these channels, such as sending common messages to all the users, sending
separate information to each user, sending data to each user privately, or
some combination of these tasks. Here we shall focus only on sending separate
data to two different receivers that we will call Bob 1 and Bob 2.

One should note in passing that while this definition of broadcast channels is 
standard in electrical engineering, it may strike computer scientists (and partic- 
ularly cryptographers) as bizarre. Indeed, computer scientists are used to defin- 
ing broadcast channels as a task to be performed: send the same message to 
multiple parties, with no notion of noise. Here we think of broadcast channels 
more as physical objects: a physical channel with one input and multiple out- 
puts, with which we may want to perform a number of different tasks. 

Broadcast channels were first introduced by Cover in [Cov72], where he sug- 
gested that it may be possible to use them more efficiently than by timesharing 
between the different users. Since then, several results concerning broadcast 
channels have been found, such as the capacity of degraded broadcast channels 
(see, for example, [CT91]). Furthermore, these results form the basis of many 
protocols that are currently used in real multiuser systems, such as cellphone
networks. 

The best achievable rate region known for general classical broadcast channels
is due to Marton [Mar79]: given a probability distribution $p(x, u_1, u_2) =
p(u_1, u_2)\, p(x | u_1, u_2)$, the following rate region is achievable for the general
two-user broadcast channel $p(y_1, y_2 | x)$:

\begin{align}
0 \leq R_1 &\leq I(U_1; Y_1) \nonumber \\
0 \leq R_2 &\leq I(U_2; Y_2) \tag{5.1} \\
R_1 + R_2 &\leq I(U_1; Y_1) + I(U_2; Y_2) - I(U_1; U_2) \nonumber
\end{align}

It is conjectured that this characterizes the capacity region of general two- 
receiver broadcast channels, but despite considerable efforts, no one has been 
able to prove a converse theorem. 
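As a small illustration of how the three constraints in (5.1) interact, the sketch below tests whether a rate pair lies in Marton's region once the three mutual informations have been computed for a chosen $p(x, u_1, u_2)$. The function name and the numerical values used in the example are hypothetical.

```python
def in_marton_region(r1, r2, i_u1_y1, i_u2_y2, i_u1_u2):
    """Check the three constraints of (5.1) for a rate pair (R1, R2)."""
    individual_ok = 0 <= r1 <= i_u1_y1 and 0 <= r2 <= i_u2_y2
    sum_ok = r1 + r2 <= i_u1_y1 + i_u2_y2 - i_u1_u2
    return individual_ok and sum_ok
```

With the hypothetical values $I(U_1;Y_1) = 0.6$, $I(U_2;Y_2) = 0.5$ and $I(U_1;U_2) = 0.2$, the pair $(0.5, 0.3)$ is in the region, whereas $(0.6, 0.5)$ satisfies both individual constraints but violates the sum-rate constraint. The penalty $I(U_1;U_2)$ is the price paid for correlating the two auxiliary variables.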

The quantum generalization of broadcast channels was first studied in 
[AS00] and [YHD06] as part of a recent effort to develop a network quantum 
information theory [DH06a, DH06b, YDH08, LOW06, HIN+07, Win01, Kli01,
SVW05, GS07]. In [YHD06], the authors derived three classes of results, the first 
one about channels with a classical input and quantum outputs, the second one 
about sending a common classical message while sending quantum information 
to one receiver, and the third about sending qubits to one receiver while estab- 
lishing a GHZ state with the two receivers. 

In this chapter, we study quantum broadcast channels using the general techniques
developed in this thesis. We look at the case where Alice initially shares
a tripartite state $\psi_1^{A_1 B_1 R_1}$ with Bob 1 and a reference, and would like to transfer
her share $A_1$ to Bob 1 using the broadcast channel. She would simultaneously
like to do the same with $\psi_2^{A_2 B_2 R_2}$ and Bob 2. We first give a one-shot theorem
for this task, and then specialize it to the i.i.d. case (i.e. the channel has the form
$\mathcal{N}^{\otimes n}$) in which $\psi_1$ consists of a maximally entangled pair between Alice and a
reference, and separate maximally entangled pairs between Alice and Bob 1;
the same goes for $\psi_2$ and Bob 2. This corresponds to transmitting qubits with
rate-limited entanglement assistance to Bob 1 and Bob 2 simultaneously over 
the broadcast channel. When we use the maximum possible amount of entan- 
glement assistance, we recover a quantum version of Marton's region. On the 
other hand, when no entanglement assistance at all is used, the rate region does 
not appear to have any independent constraint on the sum rate; the informa- 
tion going to Bob 1 and to Bob 2 appear to "talk past each other". Interest- 
ingly, it turns out that the same phenomenon occurs in the classical scenario 
of Gaussian multiple-antenna broadcast channels (a particular type of classical 
broadcast channels) with confidential messages [LLPS09]. This is perhaps not so 
surprising, since private classical communication tends to be the closest classi- 
cal parallel to quantum communication, in which one must inherently keep the 
information private from the environment. 

We then prove a regularized converse for the fully entanglement-assisted 
case, and give an example of a channel for which the single-letter region is opti- 
mal. 

5.2 Definitions 

Here we define the various concepts needed for this chapter. 

Definition 5.1 (Quantum broadcast channel). A quantum broadcast channel is a 
CPTP map with more than one subsystem as its output, and whose outputs are held by 
separate receivers. 

In the one-shot case, we will be interested in the following situation: the
initial state is $\psi_1^{A_1 B_1 R_1} \otimes \psi_2^{A_2 B_2 R_2}$, where $A_1$ and $A_2$ are held by Alice, $B_1$ and $B_2$
by Bob 1 and Bob 2 respectively, and $R_1$ and $R_2$ are reference systems making
the states pure. Alice wants to use the broadcast channel $\mathcal{N}^{A' \to C_1 C_2}$ to send $A_1$
to Bob 1 and $A_2$ to Bob 2 (of course, Bob 1 gets $C_1$ and Bob 2 gets $C_2$). Hence,
we will need to assert the existence of an encoding superoperator $\mathcal{E}^{A_1 A_2 \to A'}$ and
decoders $\mathcal{D}_1^{B_1 C_1 \to A_1 B_1}$ and $\mathcal{D}_2^{B_2 C_2 \to A_2 B_2}$ such that

\[ \left\| \left((\mathcal{D}_1 \otimes \mathcal{D}_2) \circ \mathcal{N} \circ \mathcal{E}\right)\left(\psi_1^{A_1 B_1 R_1} \otimes \psi_2^{A_2 B_2 R_2}\right) - \psi_1^{A_1 B_1 R_1} \otimes \psi_2^{A_2 B_2 R_2} \right\|_1 \leq \delta \]

for some $\delta$ that we find suitably small. Note here that the two decoders $\mathcal{D}_1$ and
$\mathcal{D}_2$ commute and can be applied in parallel.

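In the i.i.d. case, we will want to use the broadcast channel $\mathcal{N}^{A' \to C_1 C_2}$ $n$ times,
to transmit separate arbitrary quantum data to Bob 1 and to Bob 2, with separate
preshared entanglement with Bob 1 and Bob 2. In other words, $\psi_1^{M_1 A_1 B_1 R_1} =
\Phi^{R_1 M_1} \otimes \Phi^{A_1 B_1}$ where $M_1 A_1$ plays the role of $A_1$, and likewise for $\psi_2$. Now, for a given
protocol for this task, we define the transmission rate to Bob 1 (resp. Bob 2) $Q_1$
(resp. $Q_2$) as $\frac{1}{n} \log|M_1|$ (resp. $\frac{1}{n} \log|M_2|$) and the entanglement consumption rate
to Bob 1 (resp. Bob 2) as $E_1 = \frac{1}{n} \log|A_1|$ (resp. $E_2 = \frac{1}{n} \log|A_2|$).

We say that a four-tuple $(Q_1, Q_2, E_1, E_2)$ is achievable if there exists a sequence
of encoders $\mathcal{E}_n^{M_{1,n} A_{1,n} M_{2,n} A_{2,n} \to A'^n}$ and decoders $\mathcal{D}_{1,n}^{C_1^n B_{1,n} \to M_{1,n}}$ and $\mathcal{D}_{2,n}^{C_2^n B_{2,n} \to M_{2,n}}$,
such that

\[ \lim_{n \to \infty} \left\| \left((\mathcal{D}_{1,n} \otimes \mathcal{D}_{2,n}) \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_n\right)\left(\Phi^{R_1 M_1} \otimes \Phi^{A_1 B_1} \otimes \Phi^{R_2 M_2} \otimes \Phi^{A_2 B_2}\right) - \Phi^{R_1 M_1} \otimes \Phi^{R_2 M_2} \right\|_1 = 0 \]

with

\begin{align*}
Q_1 &= \lim_{n \to \infty} \tfrac{1}{n} \log|M_{1,n}| = \lim_{n \to \infty} \tfrac{1}{n} \log|R_{1,n}| \\
Q_2 &= \lim_{n \to \infty} \tfrac{1}{n} \log|M_{2,n}| = \lim_{n \to \infty} \tfrac{1}{n} \log|R_{2,n}| \\
E_1 &= \lim_{n \to \infty} \tfrac{1}{n} \log|A_{1,n}| = \lim_{n \to \infty} \tfrac{1}{n} \log|B_{1,n}| \\
E_2 &= \lim_{n \to \infty} \tfrac{1}{n} \log|A_{2,n}| = \lim_{n \to \infty} \tfrac{1}{n} \log|B_{2,n}|.
\end{align*}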



5.3 Direct coding theorem 



We start by proving the one-shot version of the protocol. First, however, we 
prove a simple technical lemma: 
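Lemma 5.1. If we have density operators $\rho^{ABC}$, $\sigma^A$, $\omega^{BC}$, $\tau^{AB}$, $\eta^C$ such that

\[ \left\| \rho^{ABC} - \sigma^A \otimes \omega^{BC} \right\|_1 \leq \varepsilon_1 \]

\[ \left\| \rho^{ABC} - \tau^{AB} \otimes \eta^C \right\|_1 \leq \varepsilon_2 \]

then $\left\| \rho^{ABC} - \sigma^A \otimes \tau^B \otimes \eta^C \right\|_1 \leq 2\varepsilon_1 + \varepsilon_2$.

Proof.

\begin{align*}
\left\| \rho^{ABC} - \sigma^A \otimes \tau^B \otimes \eta^C \right\|_1
&\leq \left\| \rho^{ABC} - \sigma^A \otimes \omega^{BC} \right\|_1 + \left\| \sigma^A \otimes \omega^{BC} - \sigma^A \otimes \tau^B \otimes \eta^C \right\|_1 \\
&= \left\| \rho^{ABC} - \sigma^A \otimes \omega^{BC} \right\|_1 + \left\| \omega^{BC} - \tau^B \otimes \eta^C \right\|_1 \\
&\leq \varepsilon_1 + \left\| \omega^{BC} - \rho^{BC} \right\|_1 + \left\| \rho^{BC} - \tau^B \otimes \eta^C \right\|_1 \\
&\leq 2\varepsilon_1 + \varepsilon_2
\end{align*}

where the first two inequalities are applications of the triangle inequality, the
equality is due to the fact that $\|A\|_1 = \|\sigma \otimes A\|_1$ for any operator $A$ and
density matrix $\sigma$, and the last inequality follows from the monotonicity of the
trace distance under the partial trace. □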
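Lemma 5.1 can be checked numerically in the commuting (classical) special case, where density operators are probability distributions and the trace norm is the $\ell_1$ distance. The sketch below draws random distributions and returns the slack $2\varepsilon_1 + \varepsilon_2 - \|\rho - \sigma \otimes \tau^B \otimes \eta\|_1$, with $\varepsilon_1$ and $\varepsilon_2$ taken to be the actual distances. The helper names and the restriction to classical states are assumptions made for this illustration.

```python
import random
from itertools import product

def random_pmf(outcomes, rng):
    """Random probability distribution on the given list of outcomes."""
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}

def l1_distance(p, q):
    """l1 distance between two pmfs (trace norm for diagonal states)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def lemma_5_1_slack(seed=0, dim=3):
    """Return 2*eps1 + eps2 - ||rho - sigma x tau_B x eta||_1 on random pmfs."""
    rng = random.Random(seed)
    r = range(dim)
    rho = random_pmf(list(product(r, r, r)), rng)   # rho^{ABC}
    sigma = random_pmf(list(r), rng)                # sigma^A
    omega = random_pmf(list(product(r, r)), rng)    # omega^{BC}
    tau = random_pmf(list(product(r, r)), rng)      # tau^{AB}
    eta = random_pmf(list(r), rng)                  # eta^C
    tau_b = {}                                      # marginal of tau on B
    for (a, b), p in tau.items():
        tau_b[b] = tau_b.get(b, 0.0) + p
    sigma_omega = {(a, b, c): sigma[a] * omega[(b, c)] for a, b, c in rho}
    tau_eta = {(a, b, c): tau[(a, b)] * eta[c] for a, b, c in rho}
    target = {(a, b, c): sigma[a] * tau_b[b] * eta[c] for a, b, c in rho}
    eps1 = l1_distance(rho, sigma_omega)
    eps2 = l1_distance(rho, tau_eta)
    return 2 * eps1 + eps2 - l1_distance(rho, target)
```

The slack is nonnegative for every seed, as the lemma guarantees.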

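Theorem 5.2. For any quantum broadcast channel $\mathcal{N}^{A' \to C_1 C_2}$, any pair of pure
quantum states $\psi_1^{A_1 B_1 R_1}$ and $\psi_2^{A_2 B_2 R_2}$, any pure quantum state $\sigma^{A_1'' A_2'' A' D}$ and any
$\varepsilon > 0$, there exists an encoding superoperator $\mathcal{E}^{A_1 A_2 \to A'}$ and decoding superoperators
$\mathcal{D}_1^{C_1 B_1 \to A_1 B_1}$ and $\mathcal{D}_2^{C_2 B_2 \to A_2 B_2}$ such that

\[ \left\| \left((\mathcal{D}_1 \otimes \mathcal{D}_2) \circ \mathcal{N} \circ \mathcal{E}\right)\left(\psi_1^{A_1 B_1 R_1} \otimes \psi_2^{A_2 B_2 R_2}\right) - \psi_1^{A_1 B_1 R_1} \otimes \psi_2^{A_2 B_2 R_2} \right\|_1 \leq 4\sqrt{2\sqrt{\delta_{\mathrm{enc}}} + \delta_1} + 2\sqrt{2\sqrt{\delta_{\mathrm{enc}}} + \delta_2} \]

where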

S enc = 4 x 2^™ ax ^ Al ^ 1 ~^ H m i " 2 ° ( ' yl ' 1 '' A ' 2 '- ) ' T + 5 x 2^ H ^ ( - A2 ^ 2 ~ 'h H ^in{ A 2)" _|_ 72 £ 
Figure 5.1: Diagram illustrating Theorem 5.2, with encoder, channel and decoders
purified. Each line represents a quantum system, boxes represent isometries,
and the horizontal axis represents the passage of time. Lines joined together
at either end of the diagram represent pure states. $W$ represents Alice's
encoder: she encodes the messages $A_1$ and $A_2$ into the channel input $A'$ and
discards a system $D$. The decoders $D_1$ and $D_2$ take the channel outputs $C_1$ and $C_2$
together with Bob 1 and 2's initial systems $B_1$ and $B_2$ and produce $A_1 B_1$ and
$A_2 B_2$ as output; the result being close to the initial state $\psi_1 \otimes \psi_2$.



and 



5 1 < 4 x 2-5^mirK'l^ , 2 , c 2 )^. CT -|^ in (A 1 |ij 1 )^ 1 +32e 



5 2 < 5 x 2-3^ ^ A 'i\ EDA " c ^u M -a-\H^{A 2 \R 2 )^ +4Qe _ 



See Figure 5.1 for an illustration of the purified version of the protocol. 

Proof. Let $U_{\mathcal{N}}^{A' \to C_1 C_2 E}$ and $W^{A_1 A_2 \to A' D}$ be Stinespring dilations of $\mathcal{N}$ and $\mathcal{E}$
respectively (it will turn out later that the encoder indeed only needs to discard a
system of size $D$). To be able to assert the existence of the two decoders, we need
to ensure that the two associated decoupling conditions are fulfilled. Those two
conditions stipulate that there must exist states $\xi_1$ and $\xi_2$ such that

\[ \left\| \left(U_{\mathcal{N}} W \cdot (\psi_1 \otimes \psi_2)\right)^{R_1 R_2 C_2 B_2 ED} - \psi_1^{R_1} \otimes \xi_1^{R_2 C_2 B_2 ED} \right\|_1 \]

\[ \left\| \left(U_{\mathcal{N}} W \cdot (\psi_1 \otimes \psi_2)\right)^{R_2 R_1 C_1 B_1 ED} - \psi_2^{R_2} \otimes \xi_2^{R_1 C_1 B_1 ED} \right\|_1 \]

are both appropriately small.
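To ensure this, let $V_1^{A_1 \to A_1''}$ and $V_2^{A_2 \to A_2''}$ be any full-rank partial isometries, let
$|\psi_1'\rangle^{A_1'' B_1 R_1} = V_1 |\psi_1\rangle^{A_1 B_1 R_1}$ and $|\psi_2'\rangle^{A_2'' B_2 R_2} = V_2 |\psi_2\rangle^{A_2 B_2 R_2}$, and define the states

\[ |\omega_1(U_2)\rangle^{A_1'' A' D R_2 B_2} = \sqrt{|A_2''|}\, \mathrm{op}_{A_2'' \to A_1'' A' D}(|\sigma\rangle)\, U_2\, |\psi_2'\rangle^{A_2'' B_2 R_2} \]

\[ |\omega_2(U_1)\rangle^{A_2'' A' D R_1 B_1} = \sqrt{|A_1''|}\, \mathrm{op}_{A_1'' \to A_2'' A' D}(|\sigma\rangle)\, U_1\, |\psi_1'\rangle^{A_1'' B_1 R_1}. \]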

We now use Theorem 3.8 to get 



C 2 EDR 2 B 2 



C / „ ~ 4 //o R \ RiEDR 2 B 2 C 2 

-^A'DR 2 B 2 

<C 2"^-- (A " |SDR2 - B2C2) ^-l(^)-|^min(^ll«l)^l + g£ (5.2) 

and 

J (|^|C^OP^ MlBl (|«2 W»^2 • ^ 2 ' K2B2 ) i?2£Di?lBlCl - ^ 2 ® W2 (£/ 1 ) 0lSMlBl 

<; 2-^mi„(^2l^iBiC 1 ) c , Ar . W2 -|^ in (A 2 |fi 2 )^ 2 + g£ ( 53 ) 

Note that the first states on the left-hand side of both inequalities are actually
the same state written differently, namely $|A_1''||A_2''| \left(U_{\mathcal{N}}\, \mathrm{op}_{A_1'' A_2'' \to A' D}(|\sigma\rangle) (U_1 \otimes U_2) \cdot
(\psi_1' \otimes \psi_2')\right)$ (see Lemma 2.7). This is close to what we need, but there are still
two problems: the encoder in the above is not an isometry, and the smooth
min-entropies should be in terms of $U_{\mathcal{N}} \cdot \sigma$ rather than $U_{\mathcal{N}} \cdot \omega_1$ and $U_{\mathcal{N}} \cdot \omega_2$. To solve
the first problem (and temporarily exacerbate the second!), we use Theorem 3.8



1 



dU 2 

1 
again to get



<C 2-^minK'l«2S 2 L l(c/2) -|^ in (A 1 |R 1 B 1 )^ 1 +g£ 



d£/i 

i 



and 



I x/zi / /, \ Ur JA%B 2 R 2 \ R2B2 ,r 2 b 2 

\M\^A'^A'(A'd{W))U2-^2 ) -l/> 2 



dU 2 



Note that, in this last inequality, the first state on the left inside the trace norm is
simply $\omega_1$; we can therefore use the triangle inequality to get



< j 2 _ ^minK'|R2B2L l(C / 2 )-^max(^l)^ lc ;f/ 2 _|_ 2~ 3 H min ( A 2% - f ^« (^2 )* 2 _|_ 16 £ 



dU x dU 2 

i 



This will allow us to solve our first problem: we will use Uhlmann's theorem
on this last inequality to obtain our encoding isometry $W^{A_1 A_2 \to A' D}$. But before
doing this, we will turn our attention to the second problem, namely that of
bounding the min-entropies on $\omega_1(U_2)$ and $\omega_2(U_1)$.

There are three such problematic min-entropies: $H^{\varepsilon}_{\min}(A_1''|R_2 B_2)_{\omega_1(U_2)}$
in this latest inequality, as well as $H^{\varepsilon}_{\min}(A_1''|EDR_2 B_2 C_2)_{U_{\mathcal{N}} \cdot \omega_1(U_2)}$ and
$H^{\varepsilon}_{\min}(A_2''|EDR_1 B_1 C_1)_{U_{\mathcal{N}} \cdot \omega_2(U_1)}$ in Equations (5.2) and (5.3) respectively. We
will first deal with the first one explicitly; the same technique applies to the
other two.

Let a A " A 'z A ' D be a state such that \\a - <r||i < 2e and H min (A'(\A'^) a = 
H^(A>l\A>£ a , and let T A '^ R ^ be defined as T(fl = Kltop^WI^)) • 
(and hence, tui(U 2 ) = T(cr)). Furthermore, let 9 A '^ be a positive semi-definite op- 
erator such that d" A " A ' 2 ' < I A " <g> 6 A * , with Tr[0] = 2- H ^ A " I^V Then, since T is 
completely positive, we have that, for any $U_2^{A_2''}$,



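Hence, if we could be certain that $\mathcal{T}(U_2 \cdot \tilde{\sigma})^{A_1'' R_2 B_2}$ is within $\varepsilon$ in fidelity distance
to $\mathcal{T}(U_2 \cdot \sigma)$, we would have shown that $H^{\varepsilon}_{\min}(A_1''|R_2 B_2)_{\omega_1(U_2)} \geq H^{\varepsilon}_{\min}(A_1''|A_2'')_\sigma$
for any $U_2$. Unfortunately, things are not so easy for us: we will instead have to
show that, when averaging over $U_2$, the fidelity distance is not too bad. Note
first that $\int \mathcal{T}(U_2 \cdot \xi)\, dU_2 = \operatorname{Tr}[\xi]\, \mathcal{T}(\pi^{A_2''})$. Likewise, letting $\tilde{\sigma} - \sigma = \Delta_+ - \Delta_-$ with
$\Delta_\pm$ positive semi-definite and having disjoint support, and with $\operatorname{Tr}[\Delta_\pm] \leq 2\varepsilon$,

\begin{align*}
\int \left\| \mathcal{T}(U_2 \cdot \tilde{\sigma}) - \mathcal{T}(U_2 \cdot \sigma) \right\|_1 dU_2 &= \int \left\| \mathcal{T}(U_2 \cdot (\Delta_+ - \Delta_-)) \right\|_1 dU_2 \\
&\leq \int \left( \operatorname{Tr}[\mathcal{T}(U_2 \cdot \Delta_+)] + \operatorname{Tr}[\mathcal{T}(U_2 \cdot \Delta_-)] \right) dU_2
\end{align*}

and therefore

\[ \int d_F\left(\mathcal{T}(U_2 \cdot \tilde{\sigma}), \mathcal{T}(U_2 \cdot \sigma)\right) dU_2 \leq 2\sqrt{\varepsilon}. \tag{5.4} \]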



Hence, on average, the fidelity distance is not too bad and we conclude that, on 
average, 

H^{A[\R2B 2 )u> 1 {U2) ^ #min(^l 1^2% ■ 

Since we want a bound on H^ in (A'(\R 2 B 2 ), we can state this as 

H^(A'I\R2B 2 ) 2 H 6 2\A'l\A^ a . 



We can also use the same trick on the other two smooth min-entropies. We
now have three inequalities that we want $U_1$ to satisfy and four more inequalities
that we need $U_2$ to satisfy. We can use Markov's inequality (see Lemma 1.7) on
these to show that there exist a $U_1^{A_1''}$ and a $U_2^{A_2''}$ that satisfy all of them. The
inequalities that must be satisfied by $U_1$ are the following:



I a in ( tt H fTT\\\TT l A '(^ B i \ R ^ EDR ^ B ^ Rl EDR 2 B 2 C 2 



< 4 x 2~^-- (A " |ii2 - B2) -i( c/ 2)-^m i n(^il«i- B i)*i _|_ 32 e (5.6) 



and 

(The last one is actually a bound on a fidelity distance as in (5.4).)

Likewise, there must exist a $U_2^{A_2''}$ such that

< 5 x 2-s H ^M\ EDRlBlCl ) u x-^-s H ^ Al >\ Ra )i» l + 40e (5.7) 



I4'l (<>P^»^(k»^ • i/V^)^ - ^ i ^2 

^ 5 X 2-5^ in K') CT -|^ in (A 2 |R 2 B 2 )^ 2 + 4 q £ ^gj 

2- ff minK'l-R2-B 2 ) wl([ , 2) ^ 2 - H ^ 20(A "W ) ° (5.9) 

2-' ff minK'l-EDii 2 B 2 C 2 )^. wl([72) ^ 2 -H e 2i 2 \A'(\EDA-C 2 ) UM . a _ 
Now, we can combine Equations (5.6), (5.8), and (5.9) to get



< 4 x 2~* I & a0 ( A "M ) °-* H ^ Al \ RlBl) *i 

+ 5 X 2"5^minK') CT -5^min(^l-R2B 2 )^ + . (5.11) 

Using Uhlmann's theorem, we finally get our encoding isometry $W^{A_1A_2\to A'D}$:

K'l (op^ 2B2 ^(k>)£/i • ^f"^)™ lBlA ' D _ ^ . (^1*1* ^ 2 * 2S2 ) 

^2v^ (5.12) 

where $\delta_{\mathrm{enc}}$ is defined as the right-hand side of (5.11). Finally, the two decoupling
conditions (5.5) and (5.7) together with Uhlmann's theorem and Lemma 5.1 yield
the existence of the two decoders. □
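
The existence step in the proof above (Markov's inequality followed by a union bound) can be illustrated on a finite toy model. The sketch below is not taken from the thesis: it assumes a finite set of equally likely candidates and $m$ nonnegative quantities, and the function name `find_good_candidate` is hypothetical. Each quantity exceeds $(m+1)$ times its mean with probability at most $1/(m+1)$, so some candidate keeps all $m$ quantities within that factor. The factors 4 and 5 appearing in the bounds above are consistent with this choice for three and four inequalities respectively, although that reading is an inference.

```python
from fractions import Fraction


def find_good_candidate(values):
    """Return an index u at which every quantity is at most (m + 1) times its mean.

    values[j][u] is the nonnegative integer value of the j-th quantity at
    candidate u; all candidates are equally likely and m = len(values).
    """
    m = len(values)
    n = len(values[0])
    means = [Fraction(sum(row), n) for row in values]
    for u in range(n):
        # Markov: a single quantity exceeds (m + 1) * mean with probability at
        # most 1 / (m + 1), so at least one of the m quantities does so with
        # probability at most m / (m + 1) < 1. Hence a good candidate exists.
        if all(row[u] <= (m + 1) * mean for row, mean in zip(values, means)):
            return u
    raise AssertionError("unreachable for nonnegative values")


# Three quantities (factor 4): candidates 0 and 1 each violate one bound.
toy = [
    [50, 0, 0, 0, 0],  # mean 10, threshold 40
    [0, 20, 0, 0, 0],  # mean 4, threshold 16
    [1, 1, 1, 1, 1],   # mean 1, threshold 4
]
good = find_good_candidate(toy)
```

In the proof itself the candidates are Haar-distributed unitaries rather than a finite list, but the counting argument is the same.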

We can now use this to prove an i.i.d. version for both entanglement-assisted 
and unassisted coding: 

Theorem 5.3. Let $\mathcal{N}^{A'\to C_1C_2}$ be a quantum broadcast channel with Stinespring
extension $U_{\mathcal{N}}^{A'\to C_1C_2E}$, let $\sigma^{A_1A_2A'D}$ be any pure state, and define
$\rho^{A_1A_2C_1C_2ED} = U_{\mathcal{N}}\cdot\sigma$. Then, all rates satisfying

$$\begin{aligned}
0 \le Q_1 + E_1 &< H(A_1)_\rho, & Q_1 - E_1 &< I(A_1\rangle C_1)_\rho,\\
0 \le Q_2 + E_2 &< H(A_2)_\rho, & Q_2 - E_2 &< I(A_2\rangle C_2)_\rho, \qquad (5.13)\\
Q_1 + E_1 + Q_2 + E_2 &< H(A_1A_2)_\rho &&
\end{aligned}$$

are achievable for quantum transmission with rate-limited entanglement assistance
through $\mathcal{N}$. In particular, if we allow $E_1$ and $E_2$ to be maximized (corresponding to
fully entanglement-assisted coding), we get a quantum version of Marton's region:

$$\begin{aligned}
Q_1 &< \tfrac{1}{2} I(A_1;C_1)_\rho\\
Q_2 &< \tfrac{1}{2} I(A_2;C_2)_\rho\\
Q_1 + Q_2 &< \tfrac{1}{2} I(A_1;C_1)_\rho + \tfrac{1}{2} I(A_2;C_2)_\rho - \tfrac{1}{2} I(A_1;A_2)_\rho.
\end{aligned}$$

Proof. The proof is little more than applying the previous theorem together with
the fully quantum AEP (Theorem 2.4). Consider using the previous theorem
on $\mathcal{N}^{\otimes n}$ with input state $\sigma^{\otimes n}$ and with transmission and entanglement
consumption rates $Q_1$, $Q_2$, $E_1$ and $E_2$. Let $R_1$ and $M_1$ be systems of dimension
$2^{nQ_1}$, with $M_1$ representing the quantum information that Alice wants to send
to Bob 1, and $R_1$ being the system that purifies it. Furthermore, let $\tilde{A}_1$ and
$B_1$ be systems of dimension $2^{nE_1}$ representing Alice's and Bob 1's halves of the
preshared entanglement. Replicate all these definitions with subscript 2 for Bob
2. Then, we define $\psi_1 = \Phi^{R_1M_1}\otimes\Phi^{\tilde{A}_1B_1}$ and $\psi_2 = \Phi^{R_2M_2}\otimes\Phi^{\tilde{A}_2B_2}$, where $M_1\tilde{A}_1$
and $M_2\tilde{A}_2$ play the roles of $A_1$ and $A_2$ from the previous theorem.

To get an error that goes down to zero as $n\to\infty$, we need to ensure that $\delta_{\mathrm{enc}}$,
$\delta_1$ and $\delta_2$ all go down to zero as $n\to\infty$. By the fully quantum AEP and using
the fact that $H_{\max}(A_1)_{\psi_1} = n(Q_1+E_1)$ and $H_{\max}(A_2)_{\psi_2} = n(Q_2+E_2)$, $\delta_{\mathrm{enc}}$ goes
down to zero if

$$\begin{aligned}
Q_1 + E_1 &< H(A_1|A_2)_\rho\\
Q_2 + E_2 &< H(A_2)_\rho.
\end{aligned}$$

Likewise, using the fact that $H_{\min}(A_1|R_1)_{\psi_1} = n(E_1-Q_1)$ and $H_{\min}(A_2|R_2)_{\psi_2} =
n(E_2-Q_2)$, we get that $\delta_1$ goes down to zero if

$$Q_1 - E_1 < H(A_1|EDA_2C_2)_\rho = I(A_1\rangle C_1)_\rho$$

and $\delta_2$ goes to zero if

$$Q_2 - E_2 < H(A_2|EDA_1C_1)_\rho = I(A_2\rangle C_2)_\rho.$$

By switching the roles of Bob 1 and Bob 2, we can also get any rate in the region

$$\begin{aligned}
Q_1 + E_1 &< H(A_1)_\rho, & Q_1 - E_1 &< I(A_1\rangle C_1)_\rho,\\
Q_2 + E_2 &< H(A_2|A_1)_\rho, & Q_2 - E_2 &< I(A_2\rangle C_2)_\rho.
\end{aligned}$$

Taking convex combinations of points in these two regions (which corresponds
to timesharing between different protocols) yields the region in the theorem
statement.

To get the fully entanglement-assisted region, we simply take linear
combinations of the various inequalities to get constraints only on $Q_1$ and $Q_2$:

$$\begin{aligned}
Q_1 &< \tfrac{1}{2}H(A_1)_\rho + \tfrac{1}{2}I(A_1\rangle C_1)_\rho = \tfrac{1}{2}I(A_1;C_1)_\rho,\\
Q_2 &< \tfrac{1}{2}H(A_2)_\rho + \tfrac{1}{2}I(A_2\rangle C_2)_\rho = \tfrac{1}{2}I(A_2;C_2)_\rho,\\
Q_1 + Q_2 &< \tfrac{1}{2}\left[H(A_1A_2)_\rho + I(A_1\rangle C_1)_\rho + I(A_2\rangle C_2)_\rho\right]\\
&= \tfrac{1}{2}\left[H(A_1A_2)_\rho - H(A_1|C_1)_\rho - H(A_2|C_2)_\rho\right]\\
&= \tfrac{1}{2}\left[H(A_1)_\rho + H(A_2)_\rho - I(A_1;A_2)_\rho - H(A_1|C_1)_\rho - H(A_2|C_2)_\rho\right]\\
&= \tfrac{1}{2}\left[I(A_1;C_1)_\rho + I(A_2;C_2)_\rho - I(A_1;A_2)_\rho\right].
\end{aligned}$$

□ 
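
As a numerical companion to Theorem 5.3, the following sketch checks whether a rate point lies in the rate-limited region and evaluates the fully entanglement-assisted bounds obtained by eliminating $E_1$ and $E_2$. It takes the entropic quantities as plain numbers; the function names and the sample values are hypothetical and are not derived from any particular channel.

```python
def in_rate_limited_region(Q1, E1, Q2, E2, H_A1, H_A2, H_A1A2, Ic1, Ic2):
    """Membership test for the region of Theorem 5.3.

    H_A1, H_A2, H_A1A2 stand for H(A1), H(A2), H(A1A2); Ic1 and Ic2 stand for
    the coherent informations I(A1>C1) and I(A2>C2), all evaluated on rho.
    """
    return (
        0 <= Q1 + E1 < H_A1 and Q1 - E1 < Ic1
        and 0 <= Q2 + E2 < H_A2 and Q2 - E2 < Ic2
        and Q1 + E1 + Q2 + E2 < H_A1A2
    )


def fully_assisted_bounds(H_A1, H_A2, H_A1A2, Ic1, Ic2):
    """Bounds on Q1, Q2 and Q1 + Q2 once E1 and E2 are eliminated.

    Uses H(A) + I(A>C) = I(A;C), so each value is half a mutual-information
    expression, as in the quantum version of Marton's region.
    """
    return (
        0.5 * (H_A1 + Ic1),
        0.5 * (H_A2 + Ic2),
        0.5 * (H_A1A2 + Ic1 + Ic2),
    )


# Hypothetical entropic values, chosen only for illustration.
example = dict(H_A1=1.0, H_A2=1.0, H_A1A2=1.5, Ic1=0.5, Ic2=0.25)
bounds = fully_assisted_bounds(**example)
inside = in_rate_limited_region(0.6, 0.3, 0.3, 0.2, **example)
outside = in_rate_limited_region(0.8, 0.1, 0.3, 0.2, **example)
```

With these sample values the sum-rate bound (1.125) is smaller than the sum of the two individual bounds (0.75 + 0.625), so the third constraint is the binding one.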

5.4 Regularized converse 

The rate region given in Theorem 5.3 is indeed the capacity region for
quantum transmission with rate-limited entanglement assistance over quantum
broadcast channels, provided we regularize over many uses of the channel. It is
important to remember, however, that regions defined by very different
formulas can nonetheless agree after regularization, so the following theorem should
be understood to be only a very weak characterization of the capacity.

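Theorem 5.4. The capacity region for rate-limited quantum transmission of a
quantum broadcast channel $\mathcal{N}^{A'\to C_1C_2}$ is the convex hull of the union of all rate points
$(Q_1, Q_2, E_1, E_2)$ satisfying

$$\begin{aligned}
Q_1 + E_1 &\le \tfrac{1}{n}H(A_1)_\psi, & Q_1 - E_1 &\le \tfrac{1}{n}I(A_1\rangle C_1^n)_\psi,\\
Q_2 + E_2 &\le \tfrac{1}{n}H(A_2)_\psi, & Q_2 - E_2 &\le \tfrac{1}{n}I(A_2\rangle C_2^n)_\psi, \qquad (5.14)\\
Q_1 + Q_2 + E_1 + E_2 &\le \tfrac{1}{n}H(A_1A_2)_\psi &&
\end{aligned}$$

for some state of the form $\psi^{A_1A_2C_1^nC_2^nDE^n} = U_{\mathcal{N}}^{\otimes n}\cdot\varphi^{A_1A_2A'^nD}$, where $\varphi$ is a pure state.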

Proof. It is immediate from Theorem 5.3 that the region is achievable. We now
prove the converse.

Suppose that $(Q_1, Q_2, E_1, E_2)$ is an achievable four-tuple. That means that
there exists a sequence of codes of length $n$ with these rates and with error rate
going to 0 as $n\to\infty$. Consider the code of block size $n$ in this sequence. Let
$\psi = \Phi^{R_1M_1}\otimes\Phi^{\tilde{A}_1B_1}\otimes\Phi^{R_2M_2}\otimes\Phi^{\tilde{A}_2B_2}$ be the state as in Theorem 5.3,
$\mathcal{E}^{M_1M_2\tilde{A}_1\tilde{A}_2\to A'^n}$ be the encoding superoperator, and let
$\rho^{R_1R_2C_1^nC_2^nB_1B_2E^n} = U_{\mathcal{N}}^{\otimes n}\cdot\mathcal{E}(\psi)$. We will
evaluate entropic quantities with respect to $\rho$.

Given that Bob 1 must be able to recover a system which purifies $R_1$ from $C_1^n$
and $B_1$, we have by Fannes' inequality (Theorem 1.9) and the monotonicity of
the mutual information (see Section 2.4.2) that $I(R_1; C_1^nB_1) \ge 2nQ_1 - n\delta_n$, where
$\delta_n \to 0$ as $n\to\infty$, and likewise for Bob 2. We also have

$$\begin{aligned}
2nQ_1 - n\delta_n &\le I(R_1; C_1^nB_1)\\
&= H(R_1) + H(C_1^nB_1) - H(R_1C_1^nB_1)\\
&\le H(R_1) + H(C_1^n) + H(B_1) - H(R_1C_1^nB_1)\\
&= nQ_1 + nE_1 + H(C_1^n) - H(R_1C_1^nB_1)\\
&= nQ_1 + nE_1 + I(R_1B_1\rangle C_1^n)
\end{aligned}$$

where the second inequality follows from subadditivity, and the line after it from
the definition of $R_1$ and $B_1$. Hence, if we identify $R_1B_1$ as $A_1$ and likewise for Bob 2,
we get

$$Q_1 - E_1 \le \tfrac{1}{n}I(A_1\rangle C_1^n) + \delta_n \qquad (5.15)$$

$$Q_2 - E_2 \le \tfrac{1}{n}I(A_2\rangle C_2^n) + \delta_n \qquad (5.16)$$

where $\delta_n \to 0$ as $n\to\infty$. Since $Q_1 + E_1 = \tfrac{1}{n}H(A_1)$, $Q_2 + E_2 = \tfrac{1}{n}H(A_2)$, and
$H(A_1A_2) = H(A_1) + H(A_2)$ by construction, this rate point is clearly inside the
region in Equation (5.14), and it follows that this is indeed the capacity of the
channel. □

While one might conjecture that Theorem 5.4 characterizes the capacity
region of a broadcast channel for quantum transmission with rate-limited
entanglement assistance even with the restriction $n = 1$, this is false even in the special
case of unassisted quantum transmission through a channel with a single
receiver [DSS98]. It may however be the true capacity for the fully
entanglement-assisted case, but there is no reason to believe that this would be any easier to
prove than to prove that Marton's region is optimal in the classical case.
5.5 Single-letter example

In the classical case, the simplest example of a broadcast channel for which 
Marton's region is optimal is a deterministic channel, i.e. a channel where the 
outputs are completely determined by the inputs. Similarly, we can show 
that our rate region is optimal for entanglement-assisted quantum transmission 
through classical deterministic channels. This is perhaps unsurprising since en- 
tanglement would be highly unlikely to help classical transmission through a 
classical channel, but it nonetheless provides an example for which our theorem 
is optimal. 

We say that $\mathcal{N}^{A'\to C_1C_2}$ is a classical deterministic broadcast channel if there
exist two deterministic functions $f_1 : \{1,\dots,|A'|\} \to \{1,\dots,|C_1|\}$ and
$f_2 : \{1,\dots,|A'|\} \to \{1,\dots,|C_2|\}$ such that
$U_{\mathcal{N}}|i\rangle = |f_1(i)\rangle^{C_1}\otimes|f_2(i)\rangle^{C_2}\otimes|i\rangle^{E}$ for
some fixed orthonormal bases on $A'$, $C_1$, $C_2$ and $E$. We claim that any rate
point that can be achieved for such a channel is a convex combination of rates
that can be achieved via our coding method with input states of the form

$$\varphi^{A_1A_2A'} = \sum_{i=1}^{|A'|} p_i\, |f_1(i)\rangle\langle f_1(i)|^{A_1}\otimes|f_2(i)\rangle\langle f_2(i)|^{A_2}\otimes|i\rangle\langle i|^{A'}$$

for some probability distribution $\{p_i\}$. To prove this, we first need the following observation:

Lemma 5.5. Let $f : \{1,\dots,|D|\} \to \{1,\dots,|B|\}$ be a function, and $|\psi\rangle^{ABCD}$ be
$\sum_i \alpha_i |\mu_i\rangle^A\otimes|f(i)\rangle^B\otimes|\nu_i\rangle^C\otimes|i\rangle^D$, where $|\mu_i\rangle$ and $|\nu_i\rangle$ are any pure states, and $|i\rangle$
and $|f(i)\rangle$ represent $i$ and $f(i)$ encoded in standard bases on $D$ and $B$ respectively.
Then, $I(A;B)_\psi \le H(B)_\psi$.

Proof. The lemma follows from the observation that because of the structure of
$\psi^{AB}$ and strong subadditivity (see Section 2.4.2), $H(B|A) \ge H(B|D)$. The latter
is a classical conditional entropy and is therefore never negative, which means
that $I(A;B)_\psi = H(B)_\psi - H(B|A)_\psi \le H(B)_\psi$. □

Armed with this, we can now show the following: 

Theorem 5.6. Let $\mathcal{N}^{A'\to C_1C_2}$ be a classical deterministic channel. Then, the capacity
region for entanglement-assisted quantum transmission on this channel is the same as
the achievable rate region given by Theorem 5.3.

Proof. According to the regularized converse theorem (Theorem 5.4), for
any achievable rate point $(Q_1, Q_2)$, there exists a state
$\psi^{A_1A_2C_1^nC_2^nE^nD} = U_{\mathcal{N}}^{\otimes n}\cdot\varphi^{A_1A_2A'^nD}$ such that
$Q_1 \le \tfrac{1}{2n}I(A_1;C_1^n)_\psi + \delta_n$ and $Q_2 \le \tfrac{1}{2n}I(A_2;C_2^n)_\psi + \delta_n$, where
$\delta_n \to 0$, and $I(A_1;A_2)_\psi = 0$. Let $C_{1,i}$ and $C_{2,i}$ be the $i$th copies of $C_1$ and $C_2$ in $C_1^n$
and $C_2^n$, and, for each $i$, let
$\psi_i^{A_1A_2C_1C_2} = \sum_{jk} |jkjk\rangle\langle jk|\,\psi^{C_{1,i}C_{2,i}}\,|jk\rangle\langle jkjk|$, where
the $|jkjk\rangle\langle jk|$ are defined in the classical basis on $C_{1,i}$ and $C_{2,i}$ and in some fixed
basis on $A_1$, $A_2$, $C_1$ and $C_2$. Then, we can bound the individual rates as follows:



$$\begin{aligned}
Q_1 &\le \tfrac{1}{2n}I(A_1;C_1^n)_\psi + \delta_n && (5.17)\\
&\le \tfrac{1}{2n}H(C_1^n)_\psi + \delta_n && (5.18)\\
&\le \tfrac{1}{2n}\sum_i H(C_{1,i})_\psi + \delta_n && (5.19)\\
&= \tfrac{1}{2n}\sum_i H(C_1)_{\psi_i} + \delta_n && (5.20)\\
&= \tfrac{1}{n}\sum_i \tfrac{1}{2}I(A_1;C_1)_{\psi_i} + \delta_n && (5.21)
\end{aligned}$$

and likewise for $Q_2$. The second inequality is due to Lemma 5.5, with the roles
of the $B$ and $D$ subsystems in the lemma played by $C_1^n$ and $E^n$ respectively, and
the third inequality makes use of the subadditivity of the von Neumann entropy.
We can now do the same thing for the sum rate:

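$$\begin{aligned}
Q_1 + Q_2 &\le \tfrac{1}{2n}\left\{I(A_1;C_1^n)_\psi + I(A_2;C_2^n)_\psi\right\} + 2\delta_n\\
&= \tfrac{1}{2n}\left\{H(A_1)_\psi + H(A_2)_\psi - H(A_1|C_1^n)_\psi - H(A_2|C_2^n)_\psi\right\} + 2\delta_n\\
&\le \tfrac{1}{2n}\left\{H(A_1A_2)_\psi - H(A_1A_2|C_1^nC_2^n)_\psi\right\} + 2\delta_n\\
&= \tfrac{1}{2n}I(A_1A_2;C_1^nC_2^n)_\psi + 2\delta_n\\
&\le \tfrac{1}{2n}H(C_1^nC_2^n)_\psi + 2\delta_n\\
&\le \tfrac{1}{2n}\sum_i H(C_{1,i}C_{2,i})_\psi + 2\delta_n\\
&= \tfrac{1}{2n}\sum_i H(C_1C_2)_{\psi_i} + 2\delta_n\\
&= \tfrac{1}{n}\sum_i \tfrac{1}{2}\left\{I(A_1;C_1)_{\psi_i} + I(A_2;C_2)_{\psi_i} - I(A_1;A_2)_{\psi_i}\right\} + 2\delta_n \qquad (5.22)
\end{aligned}$$

where, in passing to the joint entropies $H(A_1A_2)_\psi$ and $H(A_1A_2|C_1^nC_2^n)_\psi$, we have
made use of the fact that $A_1$ and $A_2$ are independent and of the standard inequality
$H(AB|CD) \le H(A|C) + H(B|D)$, and the last equality follows from the special form of the $\psi_i$'s.

Since every $\psi_i$ in equations (5.21) and (5.22) corresponds to a rate which is
achievable via Theorem 5.3, this concludes the proof. □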
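
For a concrete feel for Theorem 5.6, the sketch below evaluates the fully entanglement-assisted bounds for a toy classical deterministic broadcast channel. It relies on an observation that is an assumption of this sketch rather than a statement quoted from the text: for input states of the special form used above, $A_1$ and $A_2$ are classical copies of $C_1$ and $C_2$, so $I(A_1;C_1) = H(C_1)$, $I(A_2;C_2) = H(C_2)$ and $I(A_1;A_2) = I(C_1;C_2)$, which turns the three bounds into $H(C_1)/2$, $H(C_2)/2$ and $H(C_1C_2)/2$. Inputs are indexed from 0, and the function names are hypothetical.

```python
from collections import defaultdict
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def deterministic_channel_bounds(p, f1, f2):
    """Return the bounds on (Q1, Q2, Q1 + Q2) for input distribution p.

    p[i] is the probability of input i; f1 and f2 map an input to the symbols
    received by Bob 1 and Bob 2.
    """
    c1, c2, c12 = defaultdict(float), defaultdict(float), defaultdict(float)
    for i, p_i in enumerate(p):
        c1[f1(i)] += p_i
        c2[f2(i)] += p_i
        c12[(f1(i), f2(i))] += p_i
    return entropy(c1) / 2, entropy(c2) / 2, entropy(c12) / 2


uniform = [0.25] * 4
# Bob 1 sees the high bit of the input, Bob 2 sees the low bit.
independent = deterministic_channel_bounds(uniform, lambda i: i // 2, lambda i: i % 2)
# Both receivers see the same bit, so the sum rate is limited by one bit.
identical = deterministic_channel_bounds(uniform, lambda i: i // 2, lambda i: i // 2)
```

The factor 1/2 relative to the classical deterministic region reflects the two classical bits consumed per teleported qubit.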

5.6 Discussion 



We have exhibited and analyzed a new protocol for quantum communication 
with rate-limited entanglement assistance through quantum broadcast channels. 
Our protocol achieves the following rate region for every mixed state $\sigma^{A_1A_2A'}$:

$$\begin{aligned}
0 \le Q_1 &\le \tfrac{1}{2}I(A_1;C_1)_\rho\\
0 \le Q_2 &\le \tfrac{1}{2}I(A_2;C_2)_\rho \qquad (5.23)\\
Q_1 + Q_2 &\le \tfrac{1}{2}\left[I(A_1;C_1)_\rho + I(A_2;C_2)_\rho - I(A_1;A_2)_\rho\right]
\end{aligned}$$

where $\rho^{A_1A_2C_1C_2E} = U_{\mathcal{N}}^{A'\to C_1C_2E}\cdot\sigma^{A_1A_2A'}$.

The corresponding rate region (Equation (5.13)) is very similar to Marton's 
region for classical broadcast channels (Equation (5.1)) [Mar79]; except for the 
factors of 1/2, the two expressions are formally identical. In fact, for classical 
channels, the rates for entanglement-assisted quantum communication found 
here can be achieved directly using teleportation between the sender and the
receivers, with the classical communication required by teleportation transmitted
using Marton's protocol. From this point of view, our results can be viewed as a 
direct generalization of Marton's region to quantum channels. 

Therefore, once again, it is the entanglement-assisted version of the quan- 
tum capacity that bears the strongest resemblance to its classical counterpart. 
The same is true for both the regular point-to-point quantum channel [BSST02] 
and the quantum multiple-access channel [HDW08, HOW07] and, of course, the 
quantum channels with side information at the transmitter that were discussed 
in the last chapter. In both those cases, the known achievable rate regions for 
entanglement-assisted quantum communication are identical to their classical 
counterparts. This collection of similarities suggests a fundamental question. To 
what extent does the addition of free entanglement make quantum information 
theory similar to classical information theory? 

Of course, the lack of a single-letter converse for Marton's region and, by ex- 
tension, for our region, leaves open the possibility that the analogy might break 
down for a new, better broadcast region that remains to be discovered. A first 
step towards eliminating that uncertainty could be to find a better
characterization of the quantum regions we have presented here. The presence of the
"discarded" system $D$ in Theorem 5.3 is equivalent to optimizing over all mixed
states $\varphi^{A_1A_2A'}$ rather than only over pure states. This is not required for most
theorems in quantum information theory but we have not found a way to prove 
the regularized converse without allowing for the possibility of mixed states. We 
leave it as an open problem to determine whether it is possible to demonstrate a 
converse theorem that does not require allowing mixed states. 

Finally, for the unassisted case, it is very interesting to note the absence of an
independent constraint on the sum-rate. However, we already know that this 
region is suboptimal even for channels with a single receiver. It would therefore 
be desirable to know whether this holds for the true capacity region and whether 
there is an underlying principle that explains this phenomenon. 



CHAPTER 6 



LOCKING CLASSICAL INFORMATION IN QUANTUM STATES 

One particularly shocking feature of quantum information is the "informa- 
tion locking" effect that one sometimes observes. At the general level, it consists 
of a system in which one encodes classical data into a quantum system with two 
parts, one part being a very large "cyphertext", and the other being a very small 
key. The strange phenomenon is that it is possible to set up the system in such a
way that, given the large portion, one can get almost no information about the 
classical data by measuring the cyphertext, whereas the key allows one to "un- 
lock" this information. This may seem at first somewhat unsurprising, since this 
is what classical cryptographic systems aim to do, but it must be stressed that 
this is at the information theory level: the distribution on the classical data given 
the large portion is almost the same as the prior distribution even if the key is 
much smaller than the message. Classical encryption cannot achieve this at all: 
the distribution on the message given the cyphertext is vastly different from the 
prior distribution unless the key is as large as the message. 

Information locking schemes have already been shown to exist by [DHL+04] 
and [HLSW04]. In [DHL+04], the authors construct a scheme by encoding the 
classical information in one of two mutually unbiased bases, and the one-bit 
classical key simply tells which basis it's encoded in. Without the key, the Shan- 
non entropy about the message is approximately half of the entropy of the mes- 
sage; the key therefore increases the information the receiver has by the same 
amount. 

In [HLSW04], the authors look at a protocol where one encodes classical in- 
formation in the computational basis, and then applies one of a few (logarithmic 
in the number of possible messages) fixed unitaries. The classical key tells which 
unitary was applied. If the unitaries are chosen according to the Haar measure, 
then locking occurs with high probability. 
In both of these papers, locking was defined in terms of the accessible
information between the cyphertext and the message, which is defined as follows:

Definition 6.1 (Accessible information). Let $\rho^{AB} \in D(A\otimes B)$ be a quantum state.
Then, the accessible information $I_{\mathrm{acc}}(A;B)_\rho$ is defined as

$$I_{\mathrm{acc}}(A;B)_\rho := \sup_{\mathcal{A},\mathcal{B}} I(X;Y)_{(\mathcal{A}\otimes\mathcal{B})(\rho)},$$

where $\mathcal{A}^{A\to X}$ and $\mathcal{B}^{B\to Y}$ are measurement superoperators, and the supremum is taken
over all possible measurement superoperators. In other words, the accessible information is the largest
possible mutual information between the results of measurements made on $A$ and $B$.

Locking was said to occur when the difference in accessible information with 
and without the key was larger than the size of the key. Here we will instead use
the trace distance between the joint distribution of the measurement results and 
the message and the product of their marginals. This will imply a bound on the 
mutual information via the Alicki-Fannes inequality (Lemma 1.10). 

We now give the formal definition of locking that we will use: 

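Definition 6.2. Let $C$ and $K$ be two quantum systems. We call a set of quantum states
$\{\rho_m^{CK} : m \in \{1,\dots,N\}\}$ an $\varepsilon$-locking scheme if $\|\rho_i^{CK} - \rho_j^{CK}\|_1 = 2$ whenever $i \ne j$,
and for any complete measurement superoperator $\mathcal{M}^{C\to X}$, we have that

$$\|\mathcal{M}(\omega^{MC}) - \mathcal{M}(\pi^C)\otimes\omega^M\|_1 \le \varepsilon,$$

where $\omega^{MC} = \frac{1}{N}\sum_{i=1}^N |i\rangle\langle i|^M\otimes\rho_i^C$ and $\pi^C$ denotes the completely mixed state on $C$.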

In other words, a set of states is a locking scheme if the states are perfectly 
distinguishable when one has both the cyphertext C and the key K, whereas 
a measurement on C alone yields practically no information about which state 
was present. Note that the restriction to complete measurements is a natural one, 
since, if the goal is to maximize the information about the message, keeping a 
quantum residue is of no use: it can never hurt to measure it until nothing is left. 
Note that it is impossible to achieve this classically without making the key
almost as long as the message. One can see this by considering the fact that,
if one only needs to know an extra log K bits to reconstruct the message, our 
probability distribution of the message given the cyphertext must always be 
supported on at most K distinct messages; such a distribution must necessarily 
be far away from the uniform distribution over all messages unless K is nearly 
equal to the number of messages. 
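
The classical obstruction can be checked on a toy cipher. The sketch below is an illustration under stated assumptions, not a construction from the text: messages are uniform on $N$ values, the key is uniform on $K \le N$ values, and the cyphertext is $c = (m + k) \bmod N$. It computes the $L_1$ distance between the joint distribution of message and cyphertext and the product of its marginals, which is the classical analogue of the figure of merit in Definition 6.2. For this cipher the distance works out to $2(1 - K/N)$, so it is close to its maximum unless $K$ is comparable to $N$. All function names are hypothetical.

```python
from fractions import Fraction


def shift_cipher_joint(N, K):
    """Joint distribution of (message, cyphertext) for c = (m + k) mod N.

    The message m is uniform on range(N) and the key k is uniform on range(K),
    with K <= N.
    """
    joint = {}
    for m in range(N):
        for k in range(K):
            c = (m + k) % N
            joint[(m, c)] = joint.get((m, c), 0) + Fraction(1, N * K)
    return joint


def distance_from_product(joint):
    """L1 distance between a joint distribution and the product of its marginals."""
    pm, pc = {}, {}
    for (m, c), p in joint.items():
        pm[m] = pm.get(m, 0) + p
        pc[c] = pc.get(c, 0) + p
    return sum(abs(joint.get((m, c), 0) - pm[m] * pc[c]) for m in pm for c in pc)


short_key = distance_from_product(shift_cipher_joint(16, 2))   # one-bit key
full_key = distance_from_product(shift_cipher_joint(16, 16))   # one-time pad
```

Only the full-length key brings the distance down to zero, in line with the argument above.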

The scheme we will construct here is a special case of this model. We consider 
a scheme where we encode classical information in the computational basis of 
a quantum system, apply a fixed unitary, and split the system into two com- 
ponents, a large one (C) that becomes the cyphertext, and a small one (K) that 
becomes the key. 

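Note also that an $\varepsilon$-locking scheme automatically implies locking of the
accessible information:

Lemma 6.1. Let $\{\rho_m^{CK} : m \in \{1,\dots,N\}\}$ be an $\varepsilon$-locking scheme, and let
$\omega^{MC} = \frac{1}{N}\sum_{m=1}^N |m\rangle\langle m|^M\otimes\rho_m^C$. Then,

$$I_{\mathrm{acc}}(M;C)_\omega \le \varepsilon\log N + 2\eta(1-\varepsilon) + 2\eta(\varepsilon),$$

where $\eta(x) := -x\log x$ and $\eta(0) = 0$.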

Proof. Direct application of the Alicki-Fannes inequality (Lemma 1.10). □ 
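
To see how small the bound of Lemma 6.1 is in practice, the sketch below evaluates $\varepsilon\log N + 2\eta(1-\varepsilon) + 2\eta(\varepsilon)$ numerically. It assumes that logarithms are taken in base 2, so that the result is in bits; the function names are hypothetical.

```python
from math import log2


def eta(x):
    """eta(x) = -x log2(x), with the convention eta(0) = 0."""
    return 0.0 if x == 0 else -x * log2(x)


def accessible_info_bound(eps, N):
    """Upper bound of Lemma 6.1 on the accessible information, in bits.

    eps is the locking parameter and N the number of messages; base-2
    logarithms are an assumption of this sketch.
    """
    return eps * log2(N) + 2 * eta(1 - eps) + 2 * eta(eps)


# A 100-bit message with eps = 0.01 leaks only slightly more than one bit.
leak = accessible_info_bound(0.01, 2 ** 100)
```

The first term dominates for long messages, so $\varepsilon$ has to shrink faster than $1/\log N$ for the bound to become negligible.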

6.1 The locking scheme 

Our information locking scheme is straightforward. To encode $N$ equiprobable
classical messages, we embed them via a random partial isometry into a
system $CK$ of total dimension at least $N$; $C$ constitutes the cyphertext and $K$
constitutes the key. The key is therefore itself a quantum state; if one prefers a
scheme with a classical key as was done in [DHL+04, HLSW04], one can simply
perfectly encrypt the quantum key with a $2\log|K|$-bit classical key and make
the encrypted quantum key part of the cyphertext; the classical key is then the
locking key.
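
The perfect encryption of a quantum key with two classical bits per qubit can be illustrated with the standard quantum one-time pad, which applies $X^aZ^b$ for uniformly random bits $a, b$. The sketch below treats a single qubit, that is $|K| = 2$ and a 2-bit classical key. It is an illustration of that standard technique rather than code taken from the thesis, and the helper names are hypothetical. Averaged over the four keys, every input state becomes maximally mixed, so the encrypted key reveals nothing without the classical bits.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def encrypt(rho, a, b):
    """Apply the pad X^a Z^b to the density matrix rho."""
    u = matmul(X if a else I2, Z if b else I2)
    return matmul(matmul(u, rho), dagger(u))


def average_over_keys(rho):
    """State seen by someone who does not know the two key bits."""
    total = [[0, 0], [0, 0]]
    for a in (0, 1):
        for b in (0, 1):
            enc = encrypt(rho, a, b)
            total = [[total[i][j] + enc[i][j] for j in range(2)] for i in range(2)]
    return [[total[i][j] / 4 for j in range(2)] for i in range(2)]


plus = [[0.5, 0.5], [0.5, 0.5]]            # |+><+|
y_plus = [[0.5, -0.5j], [0.5j, 0.5]]       # +1 eigenstate of Pauli Y
mixed_from_plus = average_over_keys(plus)
mixed_from_y = average_over_keys(y_plus)
```

For a key system of dimension $|K| = 2^q$ the same pad is applied to each of the $q$ qubits, which accounts for the $2\log|K|$ classical bits.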

To prove that this works, let $\{|\psi_m\rangle : 1 \le m \le N\}$ be any set of orthonormal
pure states in $CK$. We would like to prove that there exists a $U^{CK}$ such that
$\{U^{CK}|\psi_m\rangle\}$ is a good locking scheme. To do this, we will consider the state
$\rho^{MCK} = \frac{1}{N}\sum_{m=1}^N |m\rangle\langle m|^M\otimes|\psi_m\rangle\langle\psi_m|^{CK}$ and the expression

$$\|\mathcal{M}(\mathrm{Tr}_K[U\cdot\rho^{MCK}]) - \mathcal{M}(\pi^C)\otimes\rho^M\|_1$$

for a $U^{CK}$ chosen according to the Haar measure and an arbitrary $\mathcal{M}$. We will
show that the average is sufficiently small and that the distribution is sufficiently
concentrated around the mean value to ensure that there exists a $U$ that makes
this expression small for every $\mathcal{M}$.

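Theorem 6.2. Let $\rho^{MCK}$ be a state of the form $\rho^{MCK} = \frac{1}{N}\sum_{m=1}^N |m\rangle\langle m|^M\otimes\rho_m^{CK}$. Then,
there exists a $U^{CK}$ such that for every measurement superoperator $\mathcal{M}^{C\to X}$,

$$\|\mathcal{M}(\mathrm{Tr}_K[U\cdot\rho^{MCK}]) - \mathcal{M}(\pi^C)\otimes\rho^M\|_1 \le 7\varepsilon$$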

as long as ^fl,e^ e - 2 ,and \K\ > f y^og (^) ln(l/e). 

To prove this, we will first consider M's of a very specific form that will 
allow us to take a union bound: 

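Definition 6.3 (Quasi-measurement). We call a superoperator $\mathcal{M}^{C\to X}$ an $(n,k)$-quasi-measurement
if it is of the form $\mathcal{M}(\sigma) = \frac{|C|}{n}\sum_{x=1}^n |x\rangle\langle\psi_x|\sigma|\psi_x\rangle\langle x|$ where the
$|x\rangle$ are orthonormal, and $\frac{|C|}{n}\sum_{x=1}^n |\psi_x\rangle\langle\psi_x| \le k\,\mathbb{1}^C$.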

The starting point will be the following concentration of measure result:

Lemma 6.3. Let $\mathcal{M}$ be an $(n,k)$-quasi-measurement. Then,

$$\Pr\left\{\|\mathcal{M}(\mathrm{Tr}_K[U\cdot\rho^{MCK}]) - \mathcal{M}(\pi^C)\otimes\rho^M\|_1 > 2^{\frac{1}{2}\log k - \frac{1}{2}\log|K|} + r\right\} \le 2e^{-N^2r^2/16k^2}.$$

Proof. The lemma is a direct application of Theorem 3.9. We first show that

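$\max\{\|\mathcal{M}(\mathrm{Tr}_K[X])\|_1 : X \in \mathrm{Herm}(CK),\ \|X\|_1 \le 1\} \le k$. Let $X \in \mathrm{Herm}(CK)$; then,

$$\begin{aligned}
\|\mathcal{M}(\mathrm{Tr}_K[X])\|_1 &= \frac{|C|}{n}\Big\|\sum_x |x\rangle\langle\psi_x|\mathrm{Tr}_K(X)|\psi_x\rangle\langle x|\Big\|_1\\
&= \frac{|C|}{n}\sum_x \big|\langle\psi_x|\mathrm{Tr}_K(X)|\psi_x\rangle\big|\\
&\le \frac{|C|}{n}\sum_x \langle\psi_x|\,|\mathrm{Tr}_K(X)|\,|\psi_x\rangle\\
&= \frac{|C|}{n}\,\mathrm{Tr}\Big[\sum_x |\psi_x\rangle\langle\psi_x|\,|\mathrm{Tr}_K(X)|\Big]\\
&\le k\,\|\mathrm{Tr}_K(X)\|_1 \le k,
\end{aligned}$$

where the first inequality follows from the matrix inequality $-|F| \le F \le |F|$,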
which holds for any Hermitian $F$. Next, let $\omega^{C'K'X} = \mathcal{M}(\mathrm{Tr}_K[\Phi^{C'K'CK}])$; we will
show that $H_2(C'K'|X)_\omega \ge -\log k + \log|K|$. Since $\mathcal{M}$ is a quasi-measurement, $\omega$
has the form $\omega^{C'K'X} = \sum_{x=1}^n \alpha_x |x\rangle\langle x|^X\otimes\pi^{K'}\otimes\omega_x^{C'}$, where $\alpha_x = \frac{|C|}{n}\langle\psi_x|\pi|\psi_x\rangle$.
Then, $(\omega^X)^{-1/4} = \sum_x \alpha_x^{-1/4}|x\rangle\langle x|$, and we have that



2-h 2 (Ck'\x) u , < Tr 



= Tr 



{uj x )- 1 / a u c ' k ' x {u x )- 1 / a 



"£ax\x)(x\®(n K ') 2 ®(^y 

X 
X 



1 



fc 

LK1 



Tr 



We combine this with the fact that $H_2(CK|M)_\rho = 0$ to get the lemma.



□ 
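At this point, we would like to take a union bound over all possible
quasi-measurements to be able to say that there is a nonzero probability that the trace
distance above is small for every quasi-measurement $\mathcal{M}$. We do this by
introducing an $\varepsilon$-net $\mathfrak{N}$ (see Definition 1.1) over $C$, with $|\mathfrak{N}| \le (5/\varepsilon)^{2|C|}$. For any
$(n,k)$-quasi-measurement $\mathcal{M}^{C\to X}$ of the form $\mathcal{M}(\sigma) = \sum_x \alpha_x |x\rangle\langle\psi_x|\sigma|\psi_x\rangle\langle x|$, define
$\mathcal{M}_{\mathfrak{N}}$ as $\mathcal{M}_{\mathfrak{N}}(\sigma) = \sum_x \alpha_x |x\rangle\langle\psi'_x|\sigma|\psi'_x\rangle\langle x|$, where $|\psi'_x\rangle$ is the state in $\mathfrak{N}$ closest to
$|\psi_x\rangle$.

Given a sequence $(|\varphi_x\rangle : 1 \le x \le n,\ |\varphi_x\rangle \in \mathfrak{N})$, we say that it is
$\varepsilon$-close to an $(n,k)$-quasi-measurement if there exists an $(n,k)$-quasi-measurement
$\mathcal{M}(\sigma) = \frac{|C|}{n}\sum_{x=1}^n |x\rangle\langle\psi_x|\sigma|\psi_x\rangle\langle x|$ such that $\|\psi_x - \varphi_x\|_1 \le \varepsilon$ for all $x$.
Furthermore, given a sequence $q = \{|\varphi_x\rangle : 1 \le x \le n\}$, we define $\mathcal{M}_q$ as
$\mathcal{M}_q(\sigma) = \frac{|C|}{n}\sum_{x=1}^n |x\rangle\langle\varphi_x|\sigma|\varphi_x\rangle\langle x|$.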

We can now take the desired union bound: 

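Lemma 6.4. Let $\Omega \subseteq \mathfrak{N}^n$ be the set of all sequences of $n$ elements of $\mathfrak{N}$ that are $\varepsilon$-close
to an $(n,k)$-quasi-measurement. Then,

$$\Pr\left\{\exists q \in \Omega : \|\mathcal{M}_q(\mathrm{Tr}_K[U\cdot\rho^{MCK}]) - \mathcal{M}_q(\pi^C)\otimes\rho^M\|_1 \ge 2^{\frac{1}{2}\log k - \frac{1}{2}\log|K|} + 2\varepsilon + r\right\} \le 2e^{2n\ln(5/\varepsilon)|C| - N^2r^2/16k^2}$$

and therefore, as long as $2n\ln(5/\varepsilon)|C| - N^2r^2/16k^2 < -\ln 2$, there exists a $U$ such
that no $q \in \Omega$ violates the bound inside the probability.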

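Proof. Let $\mathcal{M}_q(\sigma) = \frac{|C|}{n}\sum_x |x\rangle\langle\psi_x|\sigma|\psi_x\rangle\langle x|$ and let
$\mathcal{M}'(\sigma) = \frac{|C|}{n}\sum_x |x\rangle\langle\psi'_x|\sigma|\psi'_x\rangle\langle x|$ be an $(n,k)$-quasi-measurement that is $\varepsilon$-close to $q$.
Furthermore, let $\xi_x = \psi_x - \psi'_x$; clearly, for each $x$, $\|\xi_x\|_1 \le \varepsilon$. Then, given any
cq-state $\zeta^{MC}$ with $\zeta^C = \pi^C$, we have that

$$\begin{aligned}
\|\mathcal{M}_q(\zeta^{MC} - \zeta^M\otimes\zeta^C)\|_1
&= \frac{|C|}{n}\sum_{x=1}^n \|\mathrm{Tr}_C[\psi_x(\zeta^{MC} - \zeta^M\otimes\zeta^C)]\|_1\\
&= \frac{|C|}{n}\sum_{x=1}^n \|\mathrm{Tr}_C[(\psi'_x + \xi_x)(\zeta^{MC} - \zeta^M\otimes\zeta^C)]\|_1\\
&\le \frac{|C|}{n}\sum_{x=1}^n \Big(\|\mathrm{Tr}_C[\psi'_x(\zeta^{MC} - \zeta^M\otimes\zeta^C)]\|_1 + \|\mathrm{Tr}_C[\xi_x(\zeta^{MC} - \zeta^M\otimes\zeta^C)]\|_1\Big)\\
&\le \frac{|C|}{n}\sum_{x=1}^n \Big(\|\mathrm{Tr}_C[\psi'_x(\zeta^{MC} - \zeta^M\otimes\zeta^C)]\|_1 + \|\mathrm{Tr}_C[\xi_x\zeta^{MC}]\|_1 + \|\mathrm{Tr}_C[\xi_x(\zeta^M\otimes\zeta^C)]\|_1\Big)\\
&\le \frac{|C|}{n}\sum_{x=1}^n \Big(\|\mathrm{Tr}_C[\psi'_x(\zeta^{MC} - \zeta^M\otimes\zeta^C)]\|_1 + \frac{2\varepsilon}{|C|}\Big)\\
&= \|\mathcal{M}'(\zeta^{MC} - \zeta^M\otimes\zeta^C)\|_1 + 2\varepsilon.
\end{aligned}$$

Now, we have that $|\Omega| \le |\mathfrak{N}^n| \le (5/\varepsilon)^{2n|C|}$. Hence, by the union bound and
Lemma 6.3, we get the lemma. □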

We need to use this to get a bound on general measurement superoperators. 
The idea will be to imagine that, given any measurement operator, we perform n 
independent measurements on n i.i.d. copies of p. The operator Chernoff bound 
(Lemma 1.8) will then ensure that the resulting sequence of measurement results 
is an $(n,k)$-quasi-measurement with high probability.

Lemma 6.5. Let $\mathcal{M}^{C\to X}$ be any complete measurement superoperator, with
$\mathcal{M}(\pi) = \sum_x \alpha_x |x\rangle\langle\psi_x|\pi|\psi_x\rangle\langle x|$, and consider the operator-valued random variable $Y$ which
takes the value $|\psi_x\rangle\langle\psi_x|$ with probability $\alpha_x\langle\psi_x|\pi|\psi_x\rangle = \alpha_x/|C|$. Then, $n$ i.i.d.
copies of $Y$ will fail to be an $(n,k)$-quasi-measurement with probability at most

$$2|C|\,e^{-n(k-1)^2/(2|C|\ln 2)}.$$

Proof. $Y$ fulfills all the conditions for the operator Chernoff bound (Lemma 1.8)
to apply, with $\mathbb{E}Y = \pi^C$. This yields

$$\Pr\left\{\frac{1}{n}\sum_{i=1}^n Y_i \notin \big[(2-k)\pi^C,\ k\pi^C\big]\right\} \le 2|C|\,e^{-n(k-1)^2/(2|C|\ln 2)}$$

and the probability on the left is an upper bound on the probability that the
sequence $Y_1,\dots,Y_n$ is not an $(n,k)$-quasi-measurement. □

Putting all the pieces together, we finally get the main theorem of this section: 

Theorem 6.6. There exists a $U^{CK}$ such that for all measurement operators $\mathcal{M}^{C\to X}$,

$$\|\mathcal{M}(\mathrm{Tr}_K[U\cdot\rho^{MCK}]) - \mathcal{M}(\pi^C)\otimes\rho^M\|_1 \le 7\varepsilon$$



as long as e ^ e~ 2 , and \K\ > f Jlog (^f) ln(l/e). 

Proof. Let $\mathcal{M}^{C\to X}$ be any complete measurement superoperator of the form
$\mathcal{M}(\sigma) = \sum_x \alpha_x |x\rangle\langle\psi_x|\sigma|\psi_x\rangle\langle x|$, and define $Y$ to be the operator-valued RV which
takes value $\psi_x$ with probability $\alpha_x/|C|$. Let $Q$ be the event that $Y_1,\dots,Y_n$ is an
$(n,k)$-quasi-measurement, where the $Y_i$ are i.i.d. with the same distribution as
$Y$. Now, assuming $U$ fulfills the requirements of Lemma 6.4, we have that



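$$\begin{aligned}
&\|\mathcal{M}(\mathrm{Tr}_K[U\cdot\rho^{MCK}]) - \mathcal{M}(\pi^C)\otimes\rho^M\|_1\\
&\quad= \sum_x \alpha_x \|\mathrm{Tr}_C[\psi_x(\mathrm{Tr}_K[U\cdot\rho^{MCK}] - \pi^C\otimes\rho^M)]\|_1\\
&\quad= |C|\,\mathbb{E}_Y \|\mathrm{Tr}_C[Y(\mathrm{Tr}_K[U\cdot\rho^{MCK}] - \pi^C\otimes\rho^M)]\|_1\\
&\quad= \frac{|C|}{n}\,\mathbb{E}_{Y_1,\dots,Y_n} \sum_{i=1}^n \|\mathrm{Tr}_C[Y_i(\mathrm{Tr}_K[U\cdot\rho^{MCK}] - \pi^C\otimes\rho^M)]\|_1\\
&\quad= \frac{|C|}{n}\Pr\{Q\}\,\mathbb{E}\Big[\sum_{i=1}^n \|\mathrm{Tr}_C[Y_i(\mathrm{Tr}_K[U\cdot\rho^{MCK}] - \pi^C\otimes\rho^M)]\|_1 \,\Big|\, Q\Big]\\
&\qquad+ \frac{|C|}{n}\Pr\{\bar{Q}\}\,\mathbb{E}\Big[\sum_{i=1}^n \|\mathrm{Tr}_C[Y_i(\mathrm{Tr}_K[U\cdot\rho^{MCK}] - \pi^C\otimes\rho^M)]\|_1 \,\Big|\, \bar{Q}\Big]\\
&\quad\le 2^{\frac{1}{2}\log k - \frac{1}{2}\log|K|} + 4\varepsilon + r + 4|C|^2 e^{-n(k-1)^2/(2|C|\ln 2)}.
\end{aligned}$$

In the above, we have bounded the first conditional expectation using Lemma
6.4, with the $2\varepsilon$ going to $4\varepsilon$ due to the fact that, by definition, any
$(n,k)$-quasi-measurement is $\varepsilon$-close to an element of $\Omega$. The second conditional expectation was
simply upper bounded by $2n$ (i.e. each trace distance in the sum cannot exceed
2) and we used Lemma 6.5 to bound $\Pr\{\bar{Q}\}$, the probability that $Q$ fails.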

All that is left to do is to choose the various constants such that
$2n\ln(5/\varepsilon)|C| - N^2r^2/16k^2 < -\ln 2$ as imposed by Lemma 6.4, and such that
$4|C|^2 e^{-n(k-1)^2/(2|C|\ln 2)} \le \varepsilon$. Setting $k = 2$ and $r = \varepsilon$ and doing a few simple
computations yields that this is possible as long as

$$n \le \frac{N^2\varepsilon^2}{512\,|C|\ln(1/\varepsilon)}$$

and 

n > 2|C| log ^ — ^ 

£ 

given that iV ^ ^ and e ^ e -2 . It follows that choosing K such that 

suffices to ensure that there exists a $U$ such that

$$\|\mathcal{M}(\mathrm{Tr}_K[U\cdot\rho^{MCK}]) - \mathcal{M}(\pi^C)\otimes\rho^M\|_1 \le 7\varepsilon.$$

□ 
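
The lower bound on the number of samples $n$ can be made concrete by solving the failure-probability condition for $n$. The sketch below assumes the condition reads $4|C|^2 e^{-n(k-1)^2/(2|C|\ln 2)} \le \varepsilon$, which is a reading of the damaged expression in the proof and should be checked against the original. The function names are hypothetical.

```python
from math import ceil, exp, log


def failure_term(n, dim_c, k=2):
    """Value of 4 |C|^2 exp(-n (k - 1)^2 / (2 |C| ln 2)) under the assumed form."""
    return 4 * dim_c ** 2 * exp(-n * (k - 1) ** 2 / (2 * dim_c * log(2)))


def min_samples(dim_c, eps, k=2):
    """Smallest integer n for which failure_term(n, dim_c, k) <= eps.

    Obtained by taking logarithms on both sides of the inequality.
    """
    return ceil(2 * dim_c * log(2) * log(4 * dim_c ** 2 / eps) / (k - 1) ** 2)


# Toy numbers: an 8-dimensional cyphertext system and eps = 0.01.
n_needed = min_samples(8, 0.01)
```

The required $n$ grows linearly in $|C|$ and only logarithmically in $1/\varepsilon$, which is what leaves room for it below the upper bound imposed by Lemma 6.4 once $|K|$ is large enough.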

6.2 Implications for the security of quantum protocols against quantum ad- 
versaries 

When designing quantum cryptographic protocols, it is often necessary to 
show that a quantum adversary ("Eve") is left with only a negligible amount 
of information on some secret string. An initial attempt at formalizing this 
idea is to say that, at the end of the protocol, regardless of what
measurement Eve makes on her quantum system, the mutual information between her
measurement result and the secret string is at most $\varepsilon$ (in other words, her
accessible information about the message is at most $\varepsilon$). This was often taken
as the security definition for quantum key distribution, usually implicitly by
simply not considering that the adversary might keep quantum data at the
end of the protocol [LC99, SP00, NC00, GL03, LCA05] (see also discussion in
[BOHL+05, RK05, KRBM07]). In [KRBM07], it is shown that this definition of
security is inadequate, precisely because of possible locking effects. Indeed, this 
security definition does not exclude the possibility that Eve, upon gaining partial 
knowledge of S after the end of the protocol, could then gain more by making a 
measurement on her quantum register that depends on the partial information 
that she has learned. In [KRBM07], the authors exhibit a (somewhat contrived) 
quantum key distribution protocol which generates a secret n-bit key such that, 
if Eve learns the first $n-1$ bits, she can then learn the remaining bit by measuring
her own quantum register. 

The locking scheme presented above allows us to demonstrate a much more 
spectacular failure of this security definition. We will show that there exists a 
quantum key distribution protocol that ensures that an adversary has negligi- 
ble accessible information about the final key, but with which an adversary can 
recover the entire key upon learning only a very small fraction of it. 

6.2.1 Description of the protocol 

We will derive this protocol by taking a protocol that is truly secure, and 
then making Alice send a locked version of the secret string directly to Eve. 
We will be able to prove that regardless of what measurement Eve makes on 
her state, she will learn essentially no information on the string, but of course, 
she only needs to learn a tiny amount of information to unlock what Alice sent 
her. More precisely, let P be a quantum key distribution protocol such that, 
at the end of its execution, Alice and Bob share an $n$-bit string, and Eve has a
quantum state representing everything that she has managed to learn about the
string. We will also assume that $P$ is a truly secure protocol: the string together
with Eve's quantum state can be represented as a quantum state $\sigma^{SE}$ such that
$\|\sigma^{SE} - \pi^S\otimes\sigma^E\|_1 \le \varepsilon$, where $S$ is a quantum register holding the secret string,
and $E$ is Eve's quantum register. Now, we will define the protocol $P'$ to be
the following quantum key distribution protocol: Alice and Bob first run $P$ to
generate a string $s$ of length $n$, and then Alice splits $s$ into two parts: the first



part Sk is of size log ^32/ey'log (^f^) ln(l/£)J, and the second part s c contains 
the rest of the key. Alice then uses the classical key $s_k$ to create a quantum state
in register $C$ that contains a locked version of $s_c$ and sends the system $C$ to Eve.

How secure is $P'$? It is clearly very insecure, since, if Eve ever ends up
learning $s_k$ (via a known plaintext attack, for instance), she can then completely
recover $s_c$. However, the next theorem shows that, right after the execution of $P'$,
Eve cannot make any measurement that will reveal information about the key.
In particular, $P'$ satisfies the requirement that Eve's accessible information on
the key be very low.

Theorem 6.7. Let $P$ and $P'$ be quantum key distribution protocols as defined above,
and let $\rho^{CES}$ be the state at the end of the execution of $P'$: $S$ contains the $n$-bit string $s$,
$E$ is Eve's quantum register after the execution of $P$, and $C$ contains the locked version
of $s_c$ that Alice sent to Eve. Then, for any measurement superoperator $\mathcal{M}^{CE \to X}$, there
exists a state $\xi^X$ such that

$$\left\| \mathcal{M}(\rho^{CES}) - \xi^X \otimes \pi^S \right\|_1 \le 2\varepsilon.$$

This also entails that

$$I_{\mathrm{acc}}(S; CE) \le 2\varepsilon n + 2\eta(1 - 2\varepsilon) + 2\eta(2\varepsilon)$$

via the Alicki-Fannes inequality (Lemma I.10).

Proof. From the definition of $P$, and since $P'$ does not modify $S$ or $E$, we have that

$$\|\rho^{ES} - \pi^S \otimes \rho^E\|_1 \le \varepsilon. \tag{6.1}$$

Now, let $\mathcal{C}^{S \to CS}$ be a superoperator that takes a classical string in $S$, splits it into
$s_k$ and $s_c$, creates a locked version of $s_c$ with $s_k$ as the key into the quantum
system $C$, and leaves the classical string in $S$ unchanged; this is simply the operation
that Alice performs when preparing $C$ for Eve. The above inequality
combined with the monotonicity of the trace distance under CPTP maps yields

$$\|\rho^{CES} - \mathcal{C}(\pi^S) \otimes \rho^E\|_1 \le \varepsilon \tag{6.2}$$

and hence, for any measurement superoperator $\mathcal{M}^{CE \to X}$,

$$\|\mathcal{M}(\rho^{CES}) - \mathcal{M}(\mathcal{C}(\pi^S) \otimes \rho^E)\|_1 \le \varepsilon. \tag{6.3}$$

Consider now the expression $\mathcal{M}^{CE \to X}(\mathcal{C}(\pi^S) \otimes \rho^E)$: it can be viewed as a measurement
on the $C$ system of $\mathcal{C}^{S \to CS}(\pi^S)$ alone that is implemented by creating
the state $\rho^E$ and then measuring $\mathcal{M}^{CE \to X}$. Furthermore, note that, by the definition
of an $\varepsilon$-locking scheme, we have that, for every measurement superoperator $\mathcal{N}^{C \to X}$,

$$\|\mathcal{N}(\mathcal{C}(\pi^S)) - \mathcal{N}(\mathrm{Tr}_S[\mathcal{C}(\pi^S)]) \otimes \pi^S\|_1 \le \varepsilon. \tag{6.4}$$

Applying this to $\mathcal{M}^{CE \to X}(\mathcal{C}(\pi^S) \otimes \rho^E)$, we get that

$$\|\mathcal{M}(\mathcal{C}(\pi^S) \otimes \rho^E) - \mathcal{M}(\mathrm{Tr}_S[\mathcal{C}(\pi^S)] \otimes \rho^E) \otimes \pi^S\|_1 \le \varepsilon. \tag{6.5}$$

We now use the triangle inequality on Equations (6.3) and (6.5) to obtain

$$\|\mathcal{M}(\rho^{CES}) - \mathcal{M}(\mathrm{Tr}_S[\mathcal{C}(\pi^S)] \otimes \rho^E) \otimes \pi^S\|_1 \le 2\varepsilon \tag{6.6}$$

which yields the theorem with $\xi^X := \mathcal{M}(\mathrm{Tr}_S[\mathcal{C}(\pi^S)] \otimes \rho^E)$. □

Hence, we have shown that requiring that Eve's accessible information on
the generated key be low is not an adequate definition of security for quantum
key distribution. We have exhibited a protocol $P'$ which guarantees low accessible
information and yet is clearly insecure due to locking effects.

6.3 Discussion 

The essence of the locking phenomenon is that it is possible to possess purely 
quantum information about a classical message: the cyphertext by itself must 
contain a lot of information about the message, since only a tiny key is required 
to get the message, but none of it can be considered classical, since no measure- 
ment succeeds in extracting this information. This phenomenon has particular 
importance in cryptography: it highlights the need to consider an adversary 
having access to quantum memory, since it is possible for a protocol to ensure
that no adversary has any classical information about a particular string even
though the adversary holds a lot of quantum information about it. The adversary then needs only
a very small amount of additional information to unlock his quantum informa- 
tion. This essentially means that security definitions in cryptography must take 
quantum information into account to be composable in the physical world. 

The main improvement of this work over previous locking schemes is the fact 
that locking is defined in terms of a trace distance between measurement out- 
puts rather than in terms of accessible information. This is strictly stronger, and 
has a more compelling interpretation: measurements made on a locked message 
cannot be distinguished with more than negligible probability from data gen- 
erated independently of the message. Furthermore, it demonstrates the failure 
of cryptographic security definitions based on measurement results even more 
flagrantly: previous results [KRBM07] showed that there exists a quantum key
distribution protocol that produces an $n$-bit key about which no adversary can
obtain significant information through a measurement, but for which there can
exist a quantum adversary who, upon learning the first $n - 1$ bits of the key, can
then learn the last one by measuring his quantum data. In this work, the quantum
adversary only needs to learn polylog($n$) bits of the key, rather than $n - 1$
bits, before being able to reconstruct the entire key.
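
The operational reading of a trace-distance bound can be illustrated on classical distributions, which is what a measurement outputs. The sketch below relies on a standard fact that is not stated in the text: with equal priors and a single sample, the best probability of telling a distribution p from a distribution q is 1/2 + ||p - q||_1 / 4, attained by guessing whichever distribution makes the observed outcome more likely. The function names and the example distributions are illustrative assumptions.

```python
def l1_distance(p, q):
    """||p - q||_1 for two distributions given as equal-length lists."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(abs(a - b) for a, b in zip(p, q))


def best_success_probability(p, q):
    """Optimal probability of guessing which of p, q produced one sample."""
    return 0.5 + l1_distance(p, q) / 4.0


def maximum_likelihood_success(p, q):
    """Success probability of the rule 'guess the likelier distribution'."""
    return 0.5 * sum(max(a, b) for a, b in zip(p, q))


uniform = [0.25, 0.25, 0.25, 0.25]   # outcomes independent of the message
leaky = [0.26, 0.24, 0.25, 0.25]     # hypothetical slightly biased outcomes
print(l1_distance(leaky, uniform))
print(best_success_probability(leaky, uniform))
print(maximum_likelihood_success(leaky, uniform))
```

Under this reading, a bound of the form ||p - q||_1 <= delta caps the distinguishing probability at 1/2 + delta/4, which is the sense in which measurement results on a locked message look like data generated independently of the message.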



CHAPTER 7 



CONCLUSION 

In this thesis, we have developed a set of mathematical tools to solve quan- 
tum information theory problems within a unified framework. These tools are 
based on the idea of decoupling: in the quantum world, ensuring that two sys- 
tems are uncorrelated implies that both of these systems are completely corre- 
lated with a third system that purifies the state that they are holding. Hence, the 
problem of information transmission, which can be viewed as the problem of es- 
tablishing perfect correlation between a sender and a receiver, can be solved by 
destroying correlation between the sender and an "environment" system that purifies
the global state. Chapter 3 presents this concept in detail and gives a general
theorem that allows us to ensure that two systems are decorrelated. This
theorem analyzes the following situation: we have a quantum channel $\mathcal{T}^{A \to E}$
and a quantum state $\rho^{AR}$, we apply a unitary $U$ on the $A$ system of $\rho$ (a unitary
chosen at random uniformly over the set of all unitaries works on average) and
then we send $A$ into the input of $\mathcal{T}$. The result is that the quality of decorrelation
only depends on two parameters: one that indicates how easy it is to decorrelate 
the state and the other that measures how good the channel is at decorrelating. 
Several different versions of this theorem are presented to adapt it to different 
uses. 

The rest of the thesis then goes on to apply these tools to more concrete infor- 
mation theory problems, allowing us to obtain new theorems as well as many 
of the most important theorems in the field, often in a more general form. These 
include the best known achievable rate for quantum transmission through quan- 
tum channels and the entanglement-assisted capacities of quantum channels for 
classical and quantum transmission. These tools also allowed us to come up with hitherto
unknown coding theorems on quantum channels with side-information at the
transmitter, as well as quantum broadcast channels.

In all of these cases, the coding theorems followed the same pattern: we first
obtain a theorem that applies to a single use of a channel, with the quality of 
transmission depending on various min- and max-entropies. We then specialize 
these theorems to the case where the "single channel" in question is actually 
n copies of the same channel, yielding an asymptotic result. In this process, 
the min- and max-entropies are bounded using the fully quantum asymptotic 
equipartition property, and turn out to become von Neumann entropies. 

We end the thesis in Chapter 6 with an application of decoupling of a slightly 
different flavour: locking classical information in quantum states. This involves 
encoding a classical message into two quantum systems: a large one (the cypher- 
text) that is almost as large as the message itself, and a very small one (the key). 
The encoding has the property that, given only the cyphertext, no measurement 
can yield any significant amount of information about the message, even though 
the cyphertext and the key together provide full information about the message. 
In contrast with previous work on locking, the definition of locking used here 
involves a trace distance between two classical distributions: results of a mea- 
surement made on a locked message, and measurement results generated in- 
dependently of the message; this is both a stronger condition and has the clear 
operational interpretation that a locked message is virtually indistinguishable 
from a random state when a measurement is made. 

7.1 Open problems and future research directions 

There are several open problems and possible research projects that arise out 
of the results presented in this thesis. Here are some of them: 

A constructive version of the locking scheme: The results presented in this 
thesis involve a random unitary chosen according to the Haar measure in some 
way. This yields proofs that certain protocols exist, but does not directly give 
a way of actually constructing them. For almost all of the results in this thesis, 
however, one can replace the Haar measure by a unitary 2-design, since only the
second moment of the Haar measure is necessary for the proofs. But there is one
exception: the information locking scheme of Chapter 6. This is because its proof
relies not only on the second moment of the Haar distribution, but also on its
concentration properties (Theorem 3.9). Indeed, Theorem 3.9 states that, in the
main decoupling theorem (Theorem 3.7), not only do we get good decoupling
on average when choosing a unitary randomly, but also that this holds with
overwhelmingly high probability. Statements of this nature abound in quantum
information theory, and their applications are far from being limited to information
locking: they can be used to show the existence of completely entangled subspaces
[HLW06] and to prove the existence of counterexamples to the additivity conjecture
[Has09]. In all of these cases (as well as locking), it would be of great interest to
have explicit, constructive examples. Finding a constructive version of Theorem 
3.9 (or perhaps something slightly more general) would most likely achieve this 
for all of the problems mentioned. 

Min-entropy bounds for larger classes of states: For all of the channel cod- 
ing problems shown in this document, we have proceeded as follows: we first 
gave a general one-shot coding theorem, and then we used it to give a theorem 
for memoryless channels. To do this, we used the fully quantum asymptotic 
equipartition property (Theorem 2.4, [TCR08]) to get a bound on the smooth 
min-entropy of i.i.d. states. If we had a way to similarly bound the smooth min- 
entropy of a larger class of states, we could apply it to a larger class of channels, 
such as various types of channels with memory. 

Optimality of the one-shot coding theorems: The various one-shot coding 
theorems presented were left without converses. However, it seems likely that 
they are, in fact, optimal, at least for some particular input distributions. Some 
special cases have already been shown to be optimal, such as state merging 
[BCR09]. 

Systematically relating classical information theory and quantum infor- 
mation theory with free entanglement: In quantum Shannon theory, it has very 
often proven to be the case that the quantum problems that bear the strongest
resemblance to their classical counterparts are those in which the various participants
share entanglement before the protocol starts and are allowed to use it to
improve the performance of the protocol. This is the case for information trans- 
mission through a regular channel: Shannon showed that the mutual informa- 
tion gives the capacity of a classical channel; and it turns out that the quantum 
mutual information characterizes the entanglement-assisted capacity of quantum 
channels. This is also true for channels with side-information at the transmitter
(see Chapter 4) as well as broadcast channels (see Chapter 5). In all of these
cases, we get essentially identical expressions for the capacities (or achievable 
rate regions) in the classical and in the quantum case. This suggests that there 
might be a general principle at work relating the two. Such a principle would 
allow us to automatically import large classes of results from the extremely vast 
body of work in classical information theory directly into quantum information 
theory. It is not clear at this point, however, to what extent this principle would 
apply, or what the most appropriate definitions would be.



Appendix I



Various technical lemmas 



In this section, we state (and usually prove) various technical lemmas used 
at various points throughout the thesis. 

The first lemma is a simple application of the triangle inequality: 

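Lemma I.1. Let $\rho$, $\rho'$ and $\sigma$ be positive semidefinite operators on $A$ such that
$\|\rho - \sigma\|_1 \le \varepsilon$, $\mathrm{Tr}[\rho'] \le \mathrm{Tr}[\sigma]$, and $\rho' \ge \rho$. Then, $\|\rho' - \sigma\|_1 \le 2\varepsilon$.

Proof. We have that

\begin{align}
\|\rho' - \rho\|_1 &= \mathrm{Tr}[\rho' - \rho] \tag{I.1} \\
&\le \mathrm{Tr}[\sigma - \rho] \tag{I.2} \\
&\le \|\sigma - \rho\|_1 \le \varepsilon \tag{I.3}
\end{align}

and hence

$$\|\rho' - \sigma\|_1 \le \|\rho - \sigma\|_1 + \|\rho' - \rho\|_1 \le 2\varepsilon. \tag{I.4}$$

□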



We then prove the following operator inequalities: 



Lemma I.2. Let $\rho^{AB}$ be positive semidefinite, and let $0 \le P^B \le I^B$. Then,

$$\mathrm{Tr}_B[P^B \rho^{AB} P^B] \le \rho^A.$$

Proof. Let $M^A$ be any positive semidefinite operator. Then,

\begin{align*}
\mathrm{Tr}\big[M^A\, \mathrm{Tr}_B[P^B \rho^{AB} P^B]\big] &= \mathrm{Tr}[(M^A \otimes I^B)(P^B \rho^{AB} P^B)] \\
&= \mathrm{Tr}[(M^A \otimes (P^B)^2)\rho^{AB}] \\
&\le \mathrm{Tr}[(M^A \otimes I^B)\rho^{AB}] \\
&= \mathrm{Tr}[M^A \rho^A]
\end{align*}

where we have used the fact that tensoring with the identity is the adjoint of the
trace superoperator, as well as the fact that $(P^B)^2 \le I^B$. Since this is true for every
positive semidefinite $M^A$, the lemma follows. □

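Lemma I.3. Let $|\psi\rangle^{AB} \in A \otimes B$ and $\rho^A \in \mathrm{Pos}(A)$ be such that $\rho^A \le \psi^A$. Then, there exists a
$P^B \in \mathrm{Pos}(B)$ such that $P^B \le I^B$ and $\mathrm{Tr}_B[P^B \cdot \psi^{AB}] = \rho^A$.

Proof. Without loss of generality, let $A$ and $B$ be equal to the support of $\psi^A$
and $\psi^B$ respectively. Define the partial isometry
$V^{B \to A} = (\psi^A)^{-1/2}\, \mathrm{op}_{B \to A}(|\psi\rangle) = \mathrm{op}_{B \to A}(|\psi\rangle)\, (\psi^B_T)^{-1/2}$,
where the $T$ subscript denotes transposition. Now,

\begin{align*}
\rho^A &= V V^\dagger \rho V V^\dagger \\
&= \mathrm{op}_{B \to A}(|\psi\rangle)\, (\psi^B_T)^{-1/2} V^\dagger \rho V (\psi^B_T)^{-1/2}\, \mathrm{op}_{B \to A}(|\psi\rangle)^\dagger \\
&= \mathrm{op}_{B \to A}(|\psi\rangle)\, V^\dagger (\psi^A)^{-1/2} \rho\, (\psi^A)^{-1/2} V\, \mathrm{op}_{B \to A}(|\psi\rangle)^\dagger \\
&= \mathrm{op}_{B \to A}(|\psi\rangle)\, (P^B_T)^2\, \mathrm{op}_{B \to A}(|\psi\rangle)^\dagger \\
&= \mathrm{op}_{B \to A}(P^B|\psi\rangle)\, \mathrm{op}_{B \to A}(P^B|\psi\rangle)^\dagger \\
&= \mathrm{Tr}_B[P^B \cdot \psi^{AB}]
\end{align*}

where we have defined $(P^B_T)^2 = V^\dagger (\psi^A)^{-1/2} \rho\, (\psi^A)^{-1/2} V \in \mathrm{Pos}(B)$.
We can now easily check that $P^B \le I^B$ since $\rho \le \psi$
implies that $(\psi^A)^{-1/2} \rho\, (\psi^A)^{-1/2} \le I^A$. □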

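The following lemma is taken from Lemma II.4 of [HLSW04]:

Lemma I.4. Given two normalized vectors $|\psi\rangle$ and $|\varphi\rangle$ in $A$, we have that

$$\|\psi - \varphi\|_1 \le 2 \big\| |\psi\rangle - |\varphi\rangle \big\|_2.$$

Proof. By Lemma 3.6 with $\sigma$ as the projector onto the 2-dimensional support of
$\psi - \varphi$, we have that

\begin{align*}
\|\psi - \varphi\|_1 &\le \sqrt{2}\, \|\psi - \varphi\|_2 \\
&= 2\sqrt{1 - \mathrm{Tr}[\psi\varphi]} \\
&\le 2\sqrt{2 - 2|\langle\psi|\varphi\rangle|} \\
&\le 2\sqrt{(\langle\psi| - \langle\varphi|)(|\psi\rangle - |\varphi\rangle)} \\
&= 2 \big\| |\psi\rangle - |\varphi\rangle \big\|_2.
\end{align*}

□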
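
The lemma above can be sanity-checked numerically on random pure states. The sketch below is an illustration, not part of the proof: it uses the standard closed form 2 * sqrt(1 - |<psi|phi>|^2) for the trace norm of the difference of two rank-one projectors (a fact not stated in the text), and all function names are illustrative.

```python
import math
import random


def normalize(v):
    """Rescale a complex vector to unit Euclidean norm."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def random_state(dim, rng):
    """Random normalized vector with complex Gaussian entries."""
    return normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)])


def projector_trace_distance(psi, phi):
    """Trace norm of |psi><psi| - |phi><phi| for unit vectors psi, phi."""
    overlap = sum(a.conjugate() * b for a, b in zip(psi, phi))
    return 2.0 * math.sqrt(max(0.0, 1.0 - abs(overlap) ** 2))


def euclidean_distance(psi, phi):
    """2-norm of the vector difference psi - phi."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(psi, phi)))


def lemma_holds(psi, phi, tol=1e-12):
    """Check the inequality of the lemma up to floating-point tolerance."""
    return projector_trace_distance(psi, phi) <= 2.0 * euclidean_distance(psi, phi) + tol


rng = random.Random(0)
violations = sum(
    not lemma_holds(random_state(4, rng), random_state(4, rng)) for _ in range(1000)
)
print("violations:", violations)
```

A global phase shows why the inequality is not tight in general: the vectors (1, 0) and (i, 0) define the same projector, so the left-hand side is zero while the vectors themselves are at Euclidean distance sqrt(2).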

The next two lemmas are simple inequalities regarding operator norms: 
Lemma I.5. Let $M^{A \to B}$ and $N^{B \to C}$ be arbitrary matrices. Then,

$$\|NM\|_2 \le \|N\|_2 \|M\|_\infty.$$

Proof. Let $U^{B \to A}$ be an isometry such that $P^B := MU$ is positive semidefinite
(such an isometry can be seen to exist by taking the singular-value decomposition
of $M$). Then, we have that

\begin{align}
\|NM\|_2 &= \|NP\|_2 \tag{I.5} \\
&= \sqrt{\mathrm{Tr}[N P^2 N^\dagger]} \tag{I.6} \\
&\le \|P\|_\infty \sqrt{\mathrm{Tr}[N N^\dagger]} \tag{I.7} \\
&= \|M\|_\infty \|N\|_2 \tag{I.8}
\end{align}

where the inequality comes from the matrix inequality $P^2 \le \|P\|_\infty^2 I$. □



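Lemma I.6. Let $M^{A \to B}$ be an arbitrary matrix. Then,

$$\|M\|_1 = \max_V |\mathrm{Tr}[VM]|$$

where the maximization is taken over all partial isometries $V^{B \to A}$.

Proof. Let us decompose $M$ as $M = \sum_j \sigma_j |\psi_j\rangle\langle\varphi_j|$ where the $|\psi_j\rangle^B$ are orthonormal,
as are the $|\varphi_j\rangle^A$, and the $\sigma_j$ are the singular values of $M$. Furthermore, let
$W^{B \to A}$ be a partial isometry such that $W|\psi_j\rangle = |\varphi_j\rangle$. Then,

$$\max_V |\mathrm{Tr}[VM]| = \max_V \Big| \sum_j \sigma_j \langle\varphi_j|V|\psi_j\rangle \Big| = \sum_j \sigma_j \langle\varphi_j|W|\psi_j\rangle = \sum_j \sigma_j = \|M\|_1.$$

□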



The next lemma is simply Markov's inequality, which we use several times 
to assert the existence of a unitary satisfying many conditions at once: 



Lemma I.7 (Markov's inequality). Let $X$ be a random variable which is always positive.
Then,

$$\Pr\{X \ge k\,\mathbb{E}X\} \le \frac{1}{k}.$$

Hence, for example, if $f_1, \ldots, f_k : \mathbb{U} \to \mathbb{R}_+$, then there exists a $U$ such that

\begin{align*}
f_1(U) &< (k + 1)\,\mathbb{E}f_1(U) \\
&\;\;\vdots \\
f_k(U) &< (k + 1)\,\mathbb{E}f_k(U)
\end{align*}

by the union bound.
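
The union-bound argument above can be checked exhaustively on a toy model. In the sketch below, the set of unitaries is replaced by a finite index set with the uniform distribution, and each function is given as a table of strictly positive values; both the finite model and the name `find_good_point` are assumptions made for illustration only.

```python
def find_good_point(tables):
    """Return an index u with tables[i][u] < (k + 1) * mean(tables[i]) for all i.

    tables[i][u] plays the role of f_i(U) on a finite, uniformly weighted
    sample space; all values are assumed strictly positive.
    """
    k = len(tables)
    size = len(tables[0])
    means = [sum(t) / size for t in tables]
    for u in range(size):
        if all(tables[i][u] < (k + 1) * means[i] for i in range(k)):
            return u
    return None  # not reached for strictly positive tables


# Two functions, each with one large outlier at a different point.
f1 = [100.0, 1.0, 1.0, 1.0]
f2 = [1.0, 50.0, 1.0, 1.0]
print(find_good_point([f1, f2]))   # prints: 2
```

Each function can exceed (k + 1) times its mean on at most a 1/(k + 1) fraction of the points, so the k bad sets together cover at most a k/(k + 1) fraction and at least one good point remains.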

The next lemma is known as the operator Chernoff bound and was first 
proven in [AW02]: 

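Lemma I.8 (Operator Chernoff bound). Let $X_1, \ldots, X_M$ be i.i.d. random variables
taking values in the operators $\mathrm{Pos}(A)$, with $0 \le X_j \le I$, with $A = \mathbb{E}X_j \ge \alpha I$, and let
$0 < \eta \le 1/2$. Then

$$\Pr\left\{ \frac{1}{M} \sum_{j=1}^{M} X_j \not\le (1 + \eta) A \right\} \le 2|A| \exp\left( -M \frac{\alpha \eta^2}{2 \ln 2} \right). \tag{I.9}$$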

We also need Fannes's inequality [Fan73] as well as its relative, the Alicki- 
Fannes inequality [AF04]: 

Lemma I.9 (Fannes's inequality [Fan73]). Let $\rho$ and $\sigma$ be density operators on $A$ such
that $\|\rho - \sigma\|_1 \le 1/e$. Then,

$$|H(A)_\rho - H(A)_\sigma| \le \|\rho - \sigma\|_1 \log|A| + \eta(\|\rho - \sigma\|_1)$$

where $\eta(x) = -x \log x$ and $e$ is the base of the natural logarithm.

Lemma I.10 (Alicki-Fannes inequality [AF04]). Given two states $\rho^{AB} \in D(A \otimes B)$
and $\sigma^{AB} \in D(A \otimes B)$, with $\|\rho^{AB} - \sigma^{AB}\|_1 = \varepsilon$, the following holds:

$$|H(A|B)_\rho - H(A|B)_\sigma| \le 4\varepsilon \log|A| + 2\eta(1 - \varepsilon) + 2\eta(\varepsilon)$$

where $\eta$ is defined as above.
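
To get a feel for the size of these two continuity bounds, they can be evaluated numerically. The sketch below makes two assumptions that are not fixed by the text: logarithms are taken in base 2 (entropies in bits), and Fannes's inequality is only exercised on classical distributions, that is, on diagonal density operators. Function names are illustrative.

```python
import math


def eta(x):
    """eta(x) = -x log2(x), extended by eta(0) = 0."""
    return 0.0 if x == 0 else -x * math.log2(x)


def shannon_entropy(p):
    """Entropy in bits of a probability vector (a diagonal state)."""
    return sum(eta(x) for x in p)


def l1_distance(p, q):
    """||p - q||_1 for two equal-length probability vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def fannes_bound(dist, dim):
    """Right-hand side of Fannes's inequality; needs dist <= 1/e."""
    if not 0 <= dist <= 1 / math.e:
        raise ValueError("Fannes's inequality requires distance <= 1/e")
    return dist * math.log2(dim) + eta(dist)


def alicki_fannes_bound(eps, dim_a):
    """Right-hand side of the Alicki-Fannes inequality for distance eps."""
    if not 0 <= eps <= 1:
        raise ValueError("this sketch only evaluates the bound for 0 <= eps <= 1")
    return 4 * eps * math.log2(dim_a) + 2 * eta(1 - eps) + 2 * eta(eps)


p = [0.5, 0.5, 0.0, 0.0]
q = [0.45, 0.45, 0.05, 0.05]
gap = abs(shannon_entropy(p) - shannon_entropy(q))
print(gap, fannes_bound(l1_distance(p, q), dim=4))
print(alicki_fannes_bound(0.01, dim_a=2 ** 10))
```

Note that the Alicki-Fannes bound depends only on the dimension of the system being conditioned on the left, not on the conditioning system, which is what makes it usable when the side information is large.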

The locking chapter needs the concept of $\varepsilon$-nets. The following definition
and lemma were taken from [HLSW04], but these concepts are used rather extensively
in other areas of mathematics, particularly in random matrix theory.

Definition I.1 ($\varepsilon$-net). A set of pure states $\mathfrak{N} \subset A$ is called an $\varepsilon$-net if, for every
normalized $|\psi\rangle \in A$, there exists a $|\varphi\rangle \in \mathfrak{N}$ such that $\big\| |\psi\rangle - |\varphi\rangle \big\|_2 \le \varepsilon/2$ and $\|\psi - \varphi\|_1 \le \varepsilon$.

Lemma I.11 (Existence of small nets). For any Hilbert space $A$ of dimension $|A|$,
there exists an $\varepsilon$-net $\mathfrak{N} \subset A$ of size $|\mathfrak{N}| \le (5/\varepsilon)^{2|A|}$.
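
The size of such a net is easiest to appreciate on a logarithmic scale. The sketch below assumes that the bound in the lemma reads (5/eps)^(2|A|), as in [HLSW04], and reports its base-2 logarithm; the function name is illustrative.

```python
import math


def log2_net_size_bound(dim: int, eps: float) -> float:
    """log2 of (5 / eps) ** (2 * dim), the assumed bound on the net size."""
    if dim < 1:
        raise ValueError("dim must be a positive integer")
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    return 2 * dim * math.log2(5 / eps)


for n_qubits in (1, 5, 10):
    dim = 2 ** n_qubits
    print(n_qubits, dim, log2_net_size_bound(dim, eps=0.1))
```

The logarithm of the net size grows linearly in the dimension, hence exponentially in the number of qubits, so a union bound over a net is only affordable when the failure probability per net element is doubly exponentially small in the number of qubits.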



